Comparison of headers and much more (script)


I have two Excel files that I converted to txt so I can work with them in pandas.

Initially both have the same number of columns (many), although the data below the headers differs. The headers are the same, but they may not be in the same order.

In file A I manually remove the columns I am not interested in. Now, in file B, I want to keep only the columns that I kept in file A, that is, drop those that are not in file A.

Finally, I want the script to check that the headers of both files are the same, tell me so, and show me the number of columns, for example, so I can be sure it has done exactly what I want.

I started writing the script and I get an error. Can anyone suggest an approach, knowing what I want to do?
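To make the filtering step concrete, here is a minimal sketch of what I am after, not a finished solution. The function name `keep_columns_of` and the toy DataFrames are made up for illustration; in the real script both frames would come from `pd.read_csv(..., sep="\t")` on the two txt files.

```python
import pandas as pd


def keep_columns_of(df_a, df_b):
    """Return df_b restricted to the columns that also exist in df_a.

    Hypothetical helper: the result follows the column order of df_a,
    so the header order of file B does not matter.
    """
    wanted = [col for col in df_a.columns if col in df_b.columns]
    return df_b[wanted]


# Toy data standing in for the two files (assumption, not my real data).
df_a = pd.DataFrame({"id": [1, 2], "name": ["x", "y"]})
df_b = pd.DataFrame({"extra": [0.1, 0.2], "name": ["p", "q"], "id": [7, 8]})

df_b_filtered = keep_columns_of(df_a, df_b)
print(list(df_b_filtered.columns))  # ['id', 'name']
```

Selecting with a list built from file A's headers drops everything else from file B and also puts the remaining columns in the same order as file A.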
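The final check could look roughly like the sketch below. It compares the headers as sets, so column order is ignored. The name `compare_headers` and the sample frames are hypothetical; only standard pandas attributes (`DataFrame.columns`) are used.

```python
import pandas as pd


def compare_headers(df_a, df_b):
    """Compare the headers of two DataFrames, ignoring column order.

    Returns (same, n_common, only_in_a, only_in_b).
    """
    cols_a, cols_b = set(df_a.columns), set(df_b.columns)
    only_in_a = sorted(cols_a - cols_b)
    only_in_b = sorted(cols_b - cols_a)
    same = not only_in_a and not only_in_b
    return same, len(cols_a & cols_b), only_in_a, only_in_b


# Toy data (assumption): same headers in a different order.
df_a = pd.DataFrame({"id": [1], "name": ["x"]})
df_b = pd.DataFrame({"name": ["p"], "id": [7]})

same, n_common, only_in_a, only_in_b = compare_headers(df_a, df_b)
if same:
    print("Headers match:", n_common, "columns")
else:
    print("Only in A:", only_in_a, "| only in B:", only_in_b)
```

Returning the two difference lists as well as the flag makes it easy to see which columns are missing when the headers do not match.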

# -*- coding: utf-8 -*-

import sys

import pandas as pd


if len(sys.argv) < 3:
    print('Usage: ' + sys.argv[0] + ' {file_big} {file_small}')
    sys.exit(1)

file_big = sys.argv[1]
file_small = sys.argv[2]


def compare(file_big, file_small):
    # Read each tab-separated file into its own DataFrame.
    data_big = pd.read_csv(file_big, delimiter="\t")
    data_small = pd.read_csv(file_small, delimiter="\t")
    return data_big, data_small


# Count the number of columns from the header line of each file.
# The header is split on tabs so that names containing spaces stay whole.
with open(file_big) as big:
    line = big.readline()
print(len(line.rstrip("\n").split("\t")), "columns in big")

with open(file_small) as small:
    line = small.readline()
print(len(line.rstrip("\n").split("\t")), "columns in small")
    
asked by Juan M 04.11.2016 at 12:25

0 answers