Develop#982
Conversation
v0.3.11 v0.3.11
black 格式化
this is no need for python3
Tagging version 0.4.0release 0.4.0release
| # add all repo | ||
| ADD ./ /opt/pyspider | ||
| # install seadaka requ | ||
| RUN pip install -r /opt/pyspider/requirements-seadaka.txt |
There was a problem hiding this comment.
Don't your package in base docker image, you could create another Dockerfile that base on this image while install your additional dependency.
binux
left a comment
There was a problem hiding this comment.
Thank you for the patch. Overall looks good, just have some question regarding the behavior change in Elasticsearch. And how did you test it?
| assert db._select(db.__tablename__, "name, age").next() == (None, 16) | ||
| db._delete(db.__tablename__, "id = 1") | ||
| assert [row for row in db._select(db.__tablename__)] == [] | ||
| assert [row for row in db._select(db.__tablename__)] |
There was a problem hiding this comment.
The behavior changed from assert the table to be empty to not empty. Why?
There was a problem hiding this comment.
This is because empty list is False in python.
There was a problem hiding this comment.
assert 是 true 断言。原来的测试是说delete之后列表为空,改过之后变成列表非空了。
| self.es = Elasticsearch(hosts=hosts) | ||
|
|
||
| self.es.indices.create(index=self.index, ignore=400) | ||
| self.es.indices.create(index=self.index) |
There was a problem hiding this comment.
Did API change? I guess it's trying to ignore the IndexAlreadyExistsException error.
https://elasticsearch-py.readthedocs.io/en/7.x/api.html#ignore
There was a problem hiding this comment.
ElasticSearch 是根据 pylint 提示优化的, 这部分做的不好, 没有测试. 我想先停止这次合并请求. 我漏掉了很多测试.
There was a problem hiding this comment.
我对 ES 的使用 不是很熟练. 哪怕是 ElasticSearch client 版本 7.10.0 和 7.17.0 之间也有很多区别. 我觉得直接升级到 最新的版本,并做好测试更好
There was a problem hiding this comment.
Sure, upgrade to the latest version is good choice, just make sure it wouldn't raise error when the index has already created.
| obj.update(kwargs) | ||
| obj['updatetime'] = time.time() | ||
| return self.es.update(index=self.index, doc_type=self.__type__, | ||
| body={'doc': obj}, id=name, refresh=True, ignore=404) |
There was a problem hiding this comment.
Same here, refresh is to make sure the change to project are visible immediately. The updates on projectdb is expected to be low, that while used as database, it's exepcted for user to see the change immediately.
|
|
||
| def get(self, name, fields=None): | ||
| ret = self.es.get(index=self.index, doc_type=self.__type__, id=name, | ||
| _source_include=fields or [], ignore=404) |
There was a problem hiding this comment.
I can't remember, but doesn't the default behavior for ProjectDB is to return None when a project doesn't exists?
| self.es = Elasticsearch(hosts=hosts) | ||
| self.es: Elasticsearch = Elasticsearch(hosts=hosts) | ||
|
|
||
| self.es.indices.create(index=self.index, ignore=400) |
There was a problem hiding this comment.
Wouldn't this cause IndexAlreadyExistsException error when the resultdb are already exists?
| ret = self.es.search(index=self.index, doc_type=self.__type__, | ||
| body={"aggs": {"projects": { | ||
| "terms": {"field": "project"} | ||
| }}}, _source=False) |
There was a problem hiding this comment.
Seems the API changed to source, that this API is meant to return just project names, that the source doesn't matter.
https://elasticsearch-py.readthedocs.io/en/latest/api.html#elasticsearch.Elasticsearch.search
| else: | ||
| for record in self.es.search(index=self.index, doc_type=self.__type__, | ||
| body={'query': {'term': {'project': project}}}, | ||
| _source_include=fields or [], from_=offset, size=limit, |
There was a problem hiding this comment.
Changed to source_includes?
https://elasticsearch-py.readthedocs.io/en/latest/api.html#elasticsearch.Elasticsearch.search
There was a problem hiding this comment.
Ignore this. My bad.
It's still their in 7.10.0
| @@ -67,6 +67,8 @@ | |||
| 'Programming Language :: Python :: 3.5', | |||
There was a problem hiding this comment.
Didn't you drop support for 3.6 and lower?
放弃 3.6 以下版本
There was a problem hiding this comment.
因为 python2 已经停止维护, 3.4, .3.5 比较低的版本如果也维护会消耗大量的精力,而且这些版本使用的人应该比较少了.
很多依赖库也会慢慢放弃低版本的支持, 所以我想放弃低版本的维护.
|
@lusi1990 great work! lets roll this out |
对 3.7,3.8, 3.9 进行适配
升级一些依赖库, 比如 flask
删除一些适配 python2.7 的代码
优化部分代码
调整了代码格式