A Small Tip for Reading Complex Spreadsheet (CSV) Data with Python
创始人
2024-02-27 15:38:47

About the CSV format

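Comma-separated values (CSV, sometimes also called character-separated values, because the delimiter does not have to be a comma) is a file format that stores tabular data (numbers and text) as plain text. "CSV" is not a single, well-defined format, although RFC 4180 gives a commonly used definition.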
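To illustrate the point that the delimiter need not be a comma, here is a minimal sketch that reads semicolon-separated text with the standard `csv` module. The two-row input is made up for illustration (its values are borrowed from the sample data in this post), and `io.StringIO` stands in for a real file.

```python
import csv
import io

# Illustrative input only: two rows separated by semicolons instead of commas.
semicolon_text = "name;price\n95号 车用汽油;8.21\n"

# The delimiter format parameter tells the reader which character splits fields.
rows = list(csv.reader(io.StringIO(semicolon_text), delimiter=";"))
print(rows)
```

Every field still comes back as a string; only the splitting character changes.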

A function defined in Python's csv module:

csv.reader(csvfile, dialect='excel', **fmtparams)

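Returns a reader object that iterates over the lines of the CSV file. The dialect parameter selects a set of dialect-specific CSV parameters; it can be an instance of a subclass of the Dialect class or one of the strings returned by list_dialects(). Each row read from the CSV file is returned as a list of strings. No automatic data type conversion is performed unless the QUOTE_NONNUMERIC format option is specified, in which case unquoted fields are converted to floats.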
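The difference between the default behavior and QUOTE_NONNUMERIC is easiest to see side by side. The following is a small sketch, not part of the original post: the one-line input is illustrative, with two quoted fields and two unquoted numbers.

```python
import csv
import io

# Illustrative line: two quoted fields followed by two unquoted numbers.
sample = '"479171","95号",8.21,58.83\n'

# Default reader: every field is returned as a string.
plain = next(csv.reader(io.StringIO(sample)))

# QUOTE_NONNUMERIC: unquoted fields are converted to float, quoted ones stay strings.
numeric = next(csv.reader(io.StringIO(sample), quoting=csv.QUOTE_NONNUMERIC))

print(plain)
print(numeric)

# The dialect names accepted by the dialect parameter.
print(csv.list_dialects())
```

Note that the file discussed below quotes every field, so QUOTE_NONNUMERIC would not convert anything in it; numeric conversion there has to happen after reading.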

The CSV file to process

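This file is supplied by an external interface. Perhaps because the software that produces it is quite old, or for some other reason, its contents are rather scattered: the same field lands in different columns from one row to the next.
Sample data: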

"","","","","","","","","","","","","","","","","油品销售明细表","","","","","","","","","","","",""
"","","加油站名称:","","","","广州*********加油站        ","","","","","","","","","","","","","","","","","","","","","",""
"","从:","","2021-10-01 00:00:00","","","","","","","到:","2022-10-01 23:59:59","","","","","","","","","","","","","","","","",""
"","流水号","","","","","","交易时间","","","","","油枪号码","","油品名称","","","油品单价","","体积","","交易金额","","起泵码","","止泵码","","","备注"
"","","","","479171","","","","2021-10-23 16:21:00","","","","","1","","95号 车用汽油(ⅥA)","","","8.21","","58.83","","479.49","","380693.19","","","380752.02",""
"","","","","","259635","","","","2021-10-23 16:32:00","","","","1","","95号 车用汽油(ⅥA)","","","8.21","","60.90","","493.00","","380752.02","","","380812.92",""
"","","","","","259636","","","","2021-10-23 16:34:00","","","","1","","95号 车用汽油(ⅥA)","","","8.21","","56.19","","446.95","","380812.92","","","380869.11",""
"","","","","","479251","","","","2021-10-23 18:30:00","","","","1","","95号 车用汽油(ⅥA)","","","8.21","","70.76","","573.94","","380869.11","","","380939.87",""
"","","","","","86765","","","","2021-10-23 18:35:00","","","","1","","95号 车用汽油(ⅥA)","","","8.21","","44.03","","361.49","","380939.87","","","380983.90",""
"","","","","479289","","","","","2021-10-23 20:11:00","","","","1","","95号 车用汽油(ⅥA)","","","8.21","","6.09","","50.00","","380983.90","","","380989.99",""
"","","","","","86775","","","","2021-10-23 20:30:00","","","","1","","95号 车用汽油(ⅥA)","","","8.21","","49.13","","393.53","","380989.99","","","381039.12",""
"","","","","479309","","","","","2021-10-23 21:23:00","","","","1","","95号 车用汽油(ⅥA)","","","8.21","","24.36","","200.00","","381039.12","","","381063.48",""
"","","","","","479413","","","","2021-10-24 03:29:00","","","","1","","95号 车用汽油(ⅥA)","","","8.21","","33.47","","271.29","","381063.48","","","381096.95",""
"打印时间:2022-10-28","","","","","","","","","","","","","","","","","","","","","","","","","","填表人:","",""
"","流水号","","","交易时间","","油枪号码","","油品名称","","油品单价","","体积","","交易金额","","起泵码","","止泵码","","","备注"
"","","","86814","","2021-10-24 09:47:00","","1","","95号 车用汽油(ⅥA)","","8.21","","52.29","","429.30","","381157.85","","","381210.14",""
"","","259822","","","2021-10-24 09:59:00","","1","","95号 车用汽油(ⅥA)","","8.21","","46.09","","374.90","","381210.14","","","381256.23",""

Parsing the data with Python's csv module

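Method one is to parse the file cell by cell, with custom handling for each row, as in my earlier post on XLS data, 《Python按单元格读取复杂电子表格(Excel)数据实践》. That approach is quite cumbersome, and once I found the second method I dropped it without hesitation.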

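Method two is to extract only the valid data. Because records in this CSV file never span multiple lines, the empty fields can be stripped from each line and the remaining values taken directly. The code is very simple: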

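import csv
import pandas as pd

dat_row = []
# Open the file for reading. newline="" is what the csv documentation recommends;
# the encoding is left at the platform default, so pass encoding=... if the file needs it.
with open("油品销售明细202110-202210.CSV", mode="r", newline="") as f:
    # Create a csv.reader instance on the open file
    reader = csv.reader(f)
    # Read the data row by row
    for row in reader:
        # Drop the empty cells in this row
        dat_col = [v for v in row if len(v)>0]
        # A detail record has exactly 9 non-empty values; titles, headers and footers do not
        if len(dat_col)==9:
            dat_row.append(dat_col)

cols_list = ['流水号', '交易时间', '油枪号码', '油品名称', '油品单价', '体积', '交易金额', '起泵码', '止泵码']
df = pd.DataFrame(dat_row,columns=cols_list)
df.to_csv('detail.csv',encoding='utf_8_sig',index=False)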

Note: the line "dat_col = [v for v in row if len(v)>0]" filters out, row by row, the cells that contain no data.
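The filter-and-count step can be tried without the real file or pandas. The sketch below runs it on four rows copied from the sample data above (a header row, two records, and the footer row). `extract_records` and `EXPECTED_FIELDS` are hypothetical names introduced here for illustration.

```python
import csv
import io

# Four rows copied from the sample data: header, two records, footer.
SAMPLE = '''"","流水号","","","","","","交易时间","","","","","油枪号码","","油品名称","","","油品单价","","体积","","交易金额","","起泵码","","止泵码","","","备注"
"","","","","479171","","","","2021-10-23 16:21:00","","","","","1","","95号 车用汽油(ⅥA)","","","8.21","","58.83","","479.49","","380693.19","","","380752.02",""
"","","","","","259635","","","","2021-10-23 16:32:00","","","","1","","95号 车用汽油(ⅥA)","","","8.21","","60.90","","493.00","","380752.02","","","380812.92",""
"打印时间:2022-10-28","","","","","","","","","","","","","","","","","","","","","","","","","","填表人:","",""
'''

# Assumption: a detail record always has exactly 9 non-empty values.
EXPECTED_FIELDS = 9


def extract_records(text, expected_fields=EXPECTED_FIELDS):
    """Return the rows that have exactly expected_fields non-empty cells."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        values = [v for v in row if len(v) > 0]  # drop empty cells
        if len(values) == expected_fields:
            records.append(values)
    return records


records = extract_records(SAMPLE)
for record in records:
    print(record)
```

The header row has ten non-empty cells (it includes 备注) and the footer row has two, so both are skipped. The same rule also means that a record with a filled-in remark, or with a genuinely blank value, would not have nine values and would be dropped silently, so it is worth comparing the output row count against the source.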

Summary

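For data files with no merged cells (here, no records spanning multiple rows), parsing is quite simple with the right approach. I really like simple methods!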

References:

快乐江小鱼. Python基础 - csv文件格式. CSDN blog, 2022.08.
肖永威. 《Python按单元格读取复杂电子表格(Excel)数据实践》. CSDN blog, 2022.11.
