
How to parse an XML file dataset using Python

Parse the XML files in a folder and store the extracted data in a CSV file for use with machine learning algorithms.

import csv
import os
import xml.etree.ElementTree as ET

# Folder that contains the XML files (raw string keeps the backslash literal)
path = r'G:\salman'

# 'a' appends to an existing file; newline='' avoids blank rows on Windows
with open('names.csv', 'a', newline='', encoding='utf-8') as csvfile:
    fieldnames = ['pair_id', 'e1', 'e2', 'sentence']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    for filename in os.listdir(path):
        if not filename.endswith('.xml'):
            continue
        fullname = os.path.join(path, filename)
        tree = ET.parse(fullname)
        # Each <sentence> carries its text and zero or more <pair> children
        for sentence in tree.findall('sentence'):
            for pair in sentence.findall('pair'):
                # Keep only pairs whose ddi attribute is 'true'
                if pair.get('ddi') == 'true':
                    writer.writerow({
                        'pair_id': pair.get('id'),
                        'e1': pair.get('e1'),
                        'e2': pair.get('e2'),
                        'sentence': sentence.get('text'),
                    })
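
To try the same idea without a folder of files, here is a minimal sketch that parses a small in-memory XML string and builds the CSV text in memory. The sample document, the helper names `extract_pairs` and `rows_to_csv`, and the header row are assumptions made for illustration; the real dataset files may contain more elements and attributes.

```python
import csv
import io
import xml.etree.ElementTree as ET

FIELDNAMES = ['pair_id', 'e1', 'e2', 'sentence']

# Hypothetical sample that mimics the sentence/pair layout used above.
SAMPLE_XML = """<document id="d0">
  <sentence id="d0.s0" text="Aspirin may increase the effect of warfarin.">
    <pair id="d0.s0.p0" e1="d0.s0.e0" e2="d0.s0.e1" ddi="true"/>
  </sentence>
  <sentence id="d0.s1" text="Ibuprofen and water were both given.">
    <pair id="d0.s1.p0" e1="d0.s1.e0" e2="d0.s1.e1" ddi="false"/>
  </sentence>
</document>"""


def extract_pairs(xml_text):
    """Return one dict per <pair> whose ddi attribute is 'true'."""
    root = ET.fromstring(xml_text)
    rows = []
    for sentence in root.findall('sentence'):
        for pair in sentence.findall('pair'):
            if pair.get('ddi') == 'true':
                rows.append({
                    'pair_id': pair.get('id'),
                    'e1': pair.get('e1'),
                    'e2': pair.get('e2'),
                    'sentence': sentence.get('text'),
                })
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text, including a header row."""
    buffer = io.StringIO(newline='')
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == '__main__':
    print(rows_to_csv(extract_pairs(SAMPLE_XML)))
```

Keeping the extraction separate from the file writing makes it easier to check the parsed rows before anything is written to disk.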
