题目要求 现有一个文件dictionary.txt
,要求对数据进行处理,分别将英文单词和中文解释保存到mysql数据库,并且可通过输入关键字进行模糊匹配查询。dictionary.txt
文件格式
1 2 3 4 5 6 abandon v.抛弃,放弃 abandonment n.放弃 abbreviation n.缩写 abeyance n.缓办,中止 abide v.遵守 ability n.能力
题目分析
数据库建表,表字段至少包括单词
和含义
。涉及知识点为SQLAlchemy
,主要是创建表。
读取文件并对内容进行处理,然后插入数据库。涉及知识点为文件读写
、字符串处理
、SQLAlchemy
插入数据。
通过关键字进行模糊匹配。涉及知识点为SQLAlchemy
查询。实现 数据库建表 createTable.py
1 2 3 4 5 6 7 8 9 10 11 12 13 from sqlalchemy import create_engine, Column, String, Integerfrom sqlalchemy.ext.declarative import declarative_baseengine = create_engine("mysql+pymysql://root:123456@localhost/test" ) Base = declarative_base() class Dictionary (Base) : __tablename__ = 'dictionary' id = Column(Integer, primary_key=True ) wod = Column(String(50 )) mean = Column(String(50 )) Base.metadata.create_all(engine)
mysql数据库查询
1 2 3 4 5 6 7 8 9 mysql> desc dictionary; + | Field | Type | Null | Key | Default | Extra | + | id | int(11) | NO | PRI | NULL | auto_increment | | key | varchar(50) | YES | | NULL | | | value | varchar(50) | YES | | NULL | | + 3 rows in set (0.00 sec)
数据处理 insertData.py
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 import codecsfrom sqlalchemy import create_enginefrom sqlalchemy.orm import sessionmakerfrom fuzzyMatching.createTable import Dictionaryengine = create_engine("mysql+pymysql://root:123456@localhost/test?charset=utf8" ) DBsession = sessionmaker(bind=engine) session = DBsession() class HandlerData (object) : def __init__ (self, dataFile) : self.dataFile = dataFile def make_data_to_str (self) : with codecs.open(self.dataFile, encoding='utf-8' ) as file: for (num, value) in enumerate(file): line = value.strip().split() diction = Dictionary(id=num+1 , wod=line[0 ], mean=line[1 ]) session.add(diction) session.commit() handlerData = HandlerData('dictionary.txt' ) handlerData.make_data_to_str() session.close()
mysql数据库查询结果
1 2 3 4 5 6 7 mysql> select count(*) from dictionary; + | count(*) | + | 7988 | + 1 row in set (0.04 sec)
数据模糊匹配 fuzzyMatch.py
1 2 3 4 5 6 7 8 9 10 11 from sqlalchemy import create_enginefrom sqlalchemy.orm import sessionmakerfrom fuzzyMatching.createTable import Dictionaryengine = create_engine("mysql+pymysql://root:123456@localhost/test?charset=utf8" ) DBsession = sessionmaker(bind=engine) session = DBsession() word = input("Please input a word:" ) zero_result = session.query(Dictionary).filter(Dictionary.wod.like("%{0}%" .format(word))).all() for i in zero_result: print(i.id, i.wod, i.mean)
输出结果
1 2 3 4 5 6 7 8 Please input a word:test ]1517 contest n.,v.争夺,比赛 1518 contestant n.参赛人5604 protest v.,n.抗议,反对5605 protestant n.清教徒,新教徒7159 test v.,n.测验,试验7160 testify v.证明,证实7161 testimony n.证据
错误提示
UnicodeEncodeError: ‘latin-1’ codec can’t encode characters in position 18-22: ordinal not in range(256)
查询结果汉字乱码
处理方法 在创建engine时需要添加?charset=utf8
参数