The problem is that the characters |, /, \, :, ?, *, <, > and " are reserved and not allowed in file names on Windows.
Since you take the file names from a column in your CSV, the remaining solution is to replace the disallowed character, in this case "|", with another one that does not cause the problem.
For this you can use the pandas.Series.str.replace method, which applies a vectorized replacement to the column:
import pandas as pd
dat = {"trafficChannel": ['152 Media DK $ 2 | 09-11-17',
                          '153 Media DK $ 2 | 09-11-17']
       }
df = pd.DataFrame(dat)
# regex=False treats "|" as a literal character, not as regex alternation
df["trafficChannel"] = df["trafficChannel"].str.replace("|", "%", regex=False)
Replace it with whatever character you prefer, or with an empty string to remove it completely. In this case it is replaced with %:
>>> df
                trafficChannel
0  152 Media DK $ 2 % 09-11-17
1  153 Media DK $ 2 % 09-11-17
In your case you can simply do:
import pandas as pd
df = pd.read_csv("All_Data_Tags.csv", header=0, sep=",")
df["trafficChannel"] = df["trafficChannel"].str.replace("|", "%", regex=False)
# groupby yields (key, sub-DataFrame) pairs; write one CSV per key
for name, group in df.groupby("trafficChannel"):
    group.to_csv("{}.csv".format(name), sep=",", index=False)
You could substitute different characters at once if you think there might be other invalid characters in the column.
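As a minimal sketch of replacing several characters at once, you can pass a regular-expression character class to str.replace with regex=True. The name INVALID_PATTERN and the second sample value are made up for illustration; the "%" replacement is just the one used above.

```python
import pandas as pd

# Character class matching the reserved characters: \ / : * ? " < > |
# (INVALID_PATTERN is an illustrative name, not part of pandas)
INVALID_PATTERN = r'[\\/:*?"<>|]'

df = pd.DataFrame({"trafficChannel": ['152 Media DK $ 2 | 09-11-17',
                                      'A/B test: "x" <y>?']})

# regex=True makes pandas interpret the pattern as a regular expression,
# so every character in the class is replaced in one vectorized pass.
df["trafficChannel"] = df["trafficChannel"].str.replace(
    INVALID_PATTERN, "%", regex=True)

print(df["trafficChannel"].tolist())
```

Note that regex=True is required here: with a literal (non-regex) replacement the whole pattern string would be searched for as-is and nothing would match.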
Another option is to apply str.replace in each iteration of the for loop:
for name, group in df.groupby("trafficChannel"):
    group.to_csv("{}.csv".format(name.replace("|", "%")),
                 sep=",", index=False)
In this case, for multiple replacements you can use str.translate.
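A possible sketch of the str.translate approach, assuming you want every reserved character mapped to "%". The names RESERVED, TABLE and safe_filename are hypothetical helpers introduced here, not part of pandas or the original code.

```python
# Reserved characters listed at the top of this answer: \ / : * ? " < > |
RESERVED = '\\/:*?"<>|'

# str.maketrans builds a translation table; here each reserved
# character maps to "%" (any valid character would work instead).
TABLE = str.maketrans({ch: "%" for ch in RESERVED})


def safe_filename(name):
    """Return name with every reserved character replaced by "%"."""
    return str(name).translate(TABLE)


print(safe_filename('152 Media DK $ 2 | 09-11-17'))
```

Inside the loop you would then format the file name with safe_filename applied to the group key instead of calling replace on it. The table is built once, so each iteration costs a single pass over the name.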