It's easy to read CSV files into a DataFrame in Python. The problem I ran into is this: I have a file with many columns, but the program never needs all of them at once. One piece of logic processes a specific set of columns, and another part of the program uses a different set. I had to write many programs this way, and when I reviewed the code, I saw far too many pd.read_csv() calls.
The same kind of call showed up again elsewhere in the program, with a different set of columns.
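To illustrate the pattern, here is a minimal sketch of what such scattered reads can look like. The sample data and column names (id, price, qty, region) are made up for the example, and an in-memory buffer stands in for the real file so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a wide input file.
CSV_TEXT = (
    "id,name,price,qty,region\n"
    "1,apple,0.5,10,east\n"
    "2,pear,0.75,4,west\n"
)

# One part of the program needs only the pricing columns...
pricing = pd.read_csv(io.StringIO(CSV_TEXT), usecols=["id", "price", "qty"])

# ...and another part reads the same data again for a different subset.
locations = pd.read_csv(io.StringIO(CSV_TEXT), usecols=["id", "region"])
```

Each call repeats the knowledge of where the data lives and how to parse it, which is what makes later changes painful.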
I have to read many CSV files like this, so whenever an input file changes, I have to update the code in many places. To avoid these repeated changes, I created a module (a single Python file) with a user-defined function (UDF) that reads a CSV file and takes use_cols as an argument.
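A rough sketch of such a module might look like the following. This is not my exact code: the names INPUT_FILES and read_input, the "orders" entry, and the sample columns are all hypothetical, and the demo at the bottom writes a throwaway file only so the example is self-contained.

```python
import os
import tempfile

import pandas as pd

# Hypothetical registry: the only place that knows where each input file
# lives and which parsing options it needs.
INPUT_FILES = {
    "orders": {"path": "orders.csv", "sep": ","},
}


def read_input(name, use_cols=None):
    """Read the registered CSV `name`, keeping only `use_cols` (all if None)."""
    options = dict(INPUT_FILES[name])  # copy so pop() leaves the registry intact
    path = options.pop("path")
    return pd.read_csv(path, usecols=use_cols, **options)


# Demo with a temporary file so the sketch runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "orders.csv")
    with open(demo_path, "w") as f:
        f.write("id,name,price,qty,region\n")
        f.write("1,apple,0.5,10,east\n")
        f.write("2,pear,0.75,4,west\n")
    INPUT_FILES["orders"]["path"] = demo_path

    # Callers ask for a logical name and the columns they need.
    pricing = read_input("orders", use_cols=["id", "price", "qty"])
    locations = read_input("orders", use_cols=["id", "region"])
```

If the file moves or its delimiter changes, only the INPUT_FILES entry needs editing; the callers stay the same.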
Now my code is shorter, and when an input file changes, I only have to update one place.