Generate text from machine-learning models right in your browser
alexcg1/jina-wikipedia-sentences 3
Using Jina to search through sentences from English-language Wikipedia
alexcg1/jina-streamlit-frontend 1
A simple front-end for Jina neural search framework, written in Streamlit, that supports querying with image, text, or drawing on a canvas.
Convert mediawiki pages into beautiful PDF books
Personal README
Alex C-G's personal website
repository for OpenSCAD scripts for e-nable project
A curated list of amazingly awesome articles, websites and resources about diversity in technology.
started redisson/redisson
started time in 5 minutes
started redis/jedis
started time in 6 minutes
issue comment florisboard/florisboard
Add multi copy paste (Clipboard history)
Just so you know, I've delayed the merge of the input-logic-rework branch to tomorrow because I've found some ultra nasty bugs in the EditorInstance which get annoying when typing fast. I've fixed most of them, but there's still a bug open which currently prevents me from merging the branch.
comment created time in 36 minutes
push event florisboard/florisboard
commit sha 058be7a169be7beae88574058f3b77cc3669b452
Fix editor instance commit text logic
push time in 40 minutes
issue opened jina-ai/jina
Serve as the document store backend for Haystack
Describe the feature
Mirroring the ticket: https://github.com/deepset-ai/haystack/issues/82
created time in an hour
push event jina-ai/jina
commit sha a2b62ff411b4a2269ce7fedf0e2c2d4ea741b15e
chore(contributor): update contributors
push time in 2 hours
Pull request review comment jina-ai/jina
 def format(self, record):
         :param record: A LogRecord object
         :returns: Formatted LogRecord with level-colour MAPPING to add corresponding colour.
         """
-        cr = copy(record)
+        cr = deepcopy(record)
Sure. Since the formatter is used in logger, which is used extensively, I'm not sure whether this will harm performance; I haven't had time to look into it. To my understanding, we should use deepcopy only when necessary.
And it's good that you raised this suggestion.
comment created time in 3 hours
push event jina-ai/jina
commit sha 2f51048c3b2a1c7fb3cf4fe7c4d0fea19581fe5c
chore(style): reformatted by jina-dev-bot
commit sha 4b6af96dec04ac17a9375f44c73ff72ac13c36ec
chore(contributor): update contributors
commit sha cb40b44f05212dbf69f8ef40792d094e51553048
ci: include docstr linter (#2045)
commit sha 3af29051e8e9e35eca5c76c52f86e60df515c834
feat(binarypb): delete on dump (#2102)
commit sha 9bbb0769b0474ddb5a0682b518f41f7e9643ff43
fix: expose env variable for workspace (#2114)
commit sha 7169fb56ad2fa3a919c0225e35fdb0ed08b910e4
fix: fix traversal_path, change from c to r (#2116)
commit sha 640daf4d389768be216dac4125ef4e837ee65d23
ci: add black (#2036) * ci: add black * ci: add git blame
commit sha e01e57df00deda8ea7bbda1f0a26ba25c60782a6
chore(contributor): update contributors
commit sha dc2be2f009b8e82be8241363efd71ce3f32cbf84
ci: reenable docstrings lint (#2118)
commit sha c258e4aa22495d3809ecbcb0ee9966938ccdbfe5
docs: update black docs and sha (#2117)
commit sha 7dd876d0a1fbfca3818c13a68521b80e43a1c617
chore(contributor): update contributors
commit sha 47ac7b0a8d55faf8032579cb6e114e9b02bf392f
chore(version): the next version will be 1.0.9 build(hanxiao): Sunday night weekly patch release
commit sha 3e91a1faf4680e055aff4cf1d6db637024aa5612
chore(contributor): update contributors
commit sha 5e32eddc940b3259fe712d3fffeecc96c8e23afb
Merge remote-tracking branch 'origin/master'
commit sha 666d302ef490d35f7eb080f108994e4582c59dc2
chore(docs): update TOC
commit sha dd687735bb2c569f8dee51ff262d88b3f271b681
refactor(cli): rename silent to quiet (#2122)
commit sha b429d2215475e56a8808b5687db9e90c2d1e133e
feat(schema): generate pydantic based jsonschema for any jina proto (#2121) * feat(schema): genereate pydantic based jsonschema for any jina proto * docs: fix return type * docs: fix docstrings * feat(schema): camel case support for all fields * test(schema): jina document to pydantic document * fix(schema): remove proto name check
commit sha caae3f6d9ba9e29583f08a7d721f8a1629e171fa
refactor: crud delete types (#2014)
commit sha f0b6a44045f7fccca05a34020fd42981ce34dc4e
refactor: prepare changes to have batching for every executor (#2110) Co-authored-by: Nan Wang <nan.wang@jina.ai>
commit sha 24ff01d30f0d6af7ed752d00a545e806108c333b
refactor: merge master
push time in 3 hours
pull request comment jina-ai/cookiecutter-jina
feat: use version matching clause
@ddelange The illustration in this PR is really good and well organized; I learned a lot from it! Looking forward to more of your contributions! ❤️
comment created time in 3 hours
push event jina-ai/jina
commit sha 282b402b0be2b93dd48e034336bafe9b1829d6aa
fix: test
push time in 3 hours
started norchen/terraform-talks
started time in 3 hours
Pull request review comment jina-ai/jina
 def format(self, record):
         :param record: A LogRecord object
         :returns: Formatted LogRecord with level-colour MAPPING to add corresponding colour.
         """
-        cr = copy(record)
+        cr = deepcopy(record)
Hi, this is a good question. I think deepcopy will definitely be slower than copy. I need to study the code base more to see how much using deepcopy affects performance.
comment created time in 3 hours
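The trade-off raised in this thread can be illustrated with a small sketch. It builds a hypothetical LogRecord (the logger name, message, and arguments are made up for the example, and this is not Jina's formatter) and compares what copy and deepcopy actually duplicate. Timings depend on the machine, so none are claimed here.

```python
import copy
import logging
import timeit

# Hypothetical record, used only for illustration.
record = logging.LogRecord(
    name="demo",
    level=logging.INFO,
    pathname="demo.py",
    lineno=1,
    msg="indexed %s",
    args=(["doc-1", "doc-2"],),
    exc_info=None,
)

shallow = copy.copy(record)   # new LogRecord, attribute values are shared
deep = copy.deepcopy(record)  # new LogRecord, attribute values copied recursively

# Rebinding an attribute on either copy leaves the original untouched,
# which is enough when a formatter only overwrites fields such as levelname.
shallow.levelname = "COLOURED-INFO"
deep.levelname = "COLOURED-INFO"

# Mutable values inside the record are shared by copy but not by deepcopy.
shares_args_shallow = shallow.args[0] is record.args[0]
shares_args_deep = deep.args[0] is record.args[0]


def compare_cost(number=10_000):
    """Return (copy_seconds, deepcopy_seconds) for `number` calls each."""
    return (
        timeit.timeit(lambda: copy.copy(record), number=number),
        timeit.timeit(lambda: copy.deepcopy(record), number=number),
    )
```

Calling compare_cost() on the target machine gives a rough idea of the relative overhead before deciding whether deepcopy is needed.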
Pull request review comment jina-ai/jina
test: refactor rank driver test
 def create_document_to_score():
     # |- matches: (id: 3, parent_id: 1, score.value: 3),
     # |- matches: (id: 4, parent_id: 1, score.value: 4),
     # |- matches: (id: 5, parent_id: 1, score.value: 5),
-    doc = Document()
+    doc = jina_pb2.DocumentProto()
Then we need to change the other jina_pb2.DocumentProto() calls to Document() in the other tests, right?
comment created time in 3 hours
Pull request review comment jina-ai/jina
test: refactor rank driver test
 def create_document_to_score():
     # |- matches: (id: 3, parent_id: 1, score.value: 3),
     # |- matches: (id: 4, parent_id: 1, score.value: 4),
     # |- matches: (id: 5, parent_id: 1, score.value: 5),
-    doc = Document()
+    doc = jina_pb2.DocumentProto()
OK, I know what to do.
comment created time in 3 hours
Pull request review comment jina-ai/jina
test: refactor rank driver test
 def create_document_to_score():
     # |- matches: (id: 3, parent_id: 1, score.value: 3),
     # |- matches: (id: 4, parent_id: 1, score.value: 4),
     # |- matches: (id: 5, parent_id: 1, score.value: 5),
-    doc = Document()
+    doc = jina_pb2.DocumentProto()
What's the reason? Which should we use, then? Document() is not correct here, because this line won't set the value: https://github.com/jina-ai/jina/blob/f0b6a44045f7fccca05a34020fd42981ce34dc4e/tests/unit/drivers/rank/test_matches2doc_rank_drivers.py#L61
comment created time in 3 hours
pull request comment jina-ai/jina
test: refactor rank driver test
Latency summary
Current PR yields:
- 😶 index QPS at 1251, delta to last 3 avg.: +2%
- 😶 query QPS at 20, delta to last 3 avg.: -2%
Breakdown
Version | Index QPS | Query QPS |
---|---|---|
current | 1251 | 20 |
1.0.7 | 1224 | 20 |
Backed by latency-tracking. Further commits will update this comment.
comment created time in 3 hours
pull request comment jina-ai/jina
test: refactor rank driver test
Codecov Report
Merging #2127 (b0760db) into refactor-rankers (c93f59a) will decrease coverage by 31.81%. The diff coverage is n/a.
@@ Coverage Diff @@
## refactor-rankers #2127 +/- ##
=====================================================
- Coverage 82.61% 50.79% -31.82%
=====================================================
Files 208 189 -19
Lines 11182 10516 -666
=====================================================
- Hits 9238 5342 -3896
- Misses 1944 5174 +3230
Flag | Coverage Δ | |
---|---|---|
daemon | ? | |
jina | 50.79% <ø> (-32.07%) | :arrow_down: |
Flags with carried forward coverage won't be shown.
Impacted Files | Coverage Δ | |
---|---|---|
jina/parsers/ping.py | 0.00% <0.00%> (-100.00%) | :arrow_down: |
jina/docker/helper.py | 0.00% <0.00%> (-100.00%) | :arrow_down: |
jina/parsers/hub/new.py | 0.00% <0.00%> (-100.00%) | :arrow_down: |
jina/parsers/hub/list.py | 0.00% <0.00%> (-100.00%) | :arrow_down: |
jina/parsers/hub/build.py | 0.00% <0.00%> (-100.00%) | :arrow_down: |
jina/parsers/hub/login.py | 0.00% <0.00%> (-100.00%) | :arrow_down: |
jina/parsers/optimizer.py | 0.00% <0.00%> (-100.00%) | :arrow_down: |
jina/parsers/hub/pushpull.py | 0.00% <0.00%> (-100.00%) | :arrow_down: |
jina/types/request/common.py | 0.00% <0.00%> (-100.00%) | :arrow_down: |
jina/types/ndarray/sparse/numpy.py | 0.00% <0.00%> (-100.00%) | :arrow_down: |
... and 136 more | | |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c93f59a...2a90694.
comment created time in 3 hours
Pull request review comment jina-ai/jina
refactor: refactor rankers, move logic to driver
 def __init__(self, *args, **kwargs):
             **kwargs
         )

-    def score(self, query_meta, old_match_scores, match_meta):
-        new_scores = [
-            (match_id, -abs(match_meta[match_id]['length'] - query_meta['length']))
-            for match_id, old_score in old_match_scores.items()
-        ]
-        return np.array(
-            new_scores,
-            dtype=[(self.COL_MATCH_ID, np.object), (self.COL_SCORE, np.float64)],
-        )
+    def score(self, old_match_scores, query_meta, match_meta):
+        new_scores = [-abs(m['length'] - query_meta['length']) for m in match_meta]
+        return new_scores

 def create_document_to_score():
There may be an error in this test; you are welcome to refactor it.
The refactor is in this PR, which I think makes more sense. We compare the absolute difference between the length of each match and the length of the query. Matches with the same value are sorted by id, ascending. https://github.com/jina-ai/jina/pull/2127
comment created time in 3 hours
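A minimal sketch of the ranking described in the comment above, assuming matches are plain dictionaries and that a higher score ranks first. The helper rank_by_length and the sample data are hypothetical; only the scoring expression is taken from the diff.

```python
def score(old_match_scores, query_meta, match_meta):
    """Negative absolute length difference: 0 is best, more negative is worse."""
    return [-abs(m["length"] - query_meta["length"]) for m in match_meta]


def rank_by_length(query_meta, matches):
    """Hypothetical helper: order match ids by score (descending), ties by id (ascending)."""
    match_meta = [{"length": m["length"]} for m in matches]
    old_scores = [m["score"] for m in matches]
    new_scores = score(old_scores, query_meta, match_meta)
    ranked = sorted(
        zip(matches, new_scores),
        key=lambda pair: (-pair[1], pair[0]["id"]),
    )
    return [m["id"] for m, _ in ranked]


# Made-up data: ids 1 and 3 are both one unit away from the query length.
query = {"length": 5}
matches = [
    {"id": 3, "length": 4, "score": 0.0},
    {"id": 1, "length": 6, "score": 0.0},
    {"id": 2, "length": 5, "score": 0.0},
    {"id": 4, "length": 9, "score": 0.0},
]
ranked_ids = rank_by_length(query, matches)
```

Here the exact-length match comes first, the two tied matches follow in id order, and the furthest match is last.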
Pull request review comment jina-ai/jina
test: refactor rank driver test
 def create_document_to_score():
     # |- matches: (id: 3, parent_id: 1, score.value: 3),
     # |- matches: (id: 4, parent_id: 1, score.value: 4),
     # |- matches: (id: 5, parent_id: 1, score.value: 5),
-    doc = Document()
+    doc = jina_pb2.DocumentProto()
Avoid the usage of Proto.
comment created time in 3 hours
Pull request review comment jina-ai/jina
refactor: refactor rankers, move logic to driver
 def __init__(self, *args, **kwargs):
             **kwargs
         )

-    def score(self, query_meta, old_match_scores, match_meta):
-        new_scores = [
-            (match_id, -abs(match_meta[match_id]['length'] - query_meta['length']))
-            for match_id, old_score in old_match_scores.items()
-        ]
-        return np.array(
-            new_scores,
-            dtype=[(self.COL_MATCH_ID, np.object), (self.COL_SCORE, np.float64)],
-        )
+    def score(self, old_match_scores, query_meta, match_meta):
+        new_scores = [-abs(m['length'] - query_meta['length']) for m in match_meta]
+        return new_scores

 def create_document_to_score():
There may be an error in this test; you are welcome to refactor it.
comment created time in 4 hours
Pull request review comment jina-ai/jina
refactor: refactor rankers, move logic to driver
 def __init__(self, *args, **kwargs):
             **kwargs
         )

-    def score(self, query_meta, old_match_scores, match_meta):
-        new_scores = [
-            (match_id, -abs(match_meta[match_id]['length'] - query_meta['length']))
-            for match_id, old_score in old_match_scores.items()
-        ]
-        return np.array(
-            new_scores,
-            dtype=[(self.COL_MATCH_ID, np.object), (self.COL_SCORE, np.float64)],
-        )
+    def score(self, old_match_scores, query_meta, match_meta):
+        new_scores = [-abs(m['length'] - query_meta['length']) for m in match_meta]
+        return new_scores

 def create_document_to_score():
Can the value of the score not be set this way? In the test, this value is 0, so old_match_scores is [0.0, 0.0, 0.0, 0.0].
comment created time in 4 hours
Pull request review comment jina-ai/jina
refactor: refactor rankers, move logic to driver
 def __init__(self, *args, **kwargs):
             **kwargs
         )

-    def score(self, query_meta, old_match_scores, match_meta):
-        new_scores = [
-            (match_id, -abs(match_meta[match_id]['length'] - query_meta['length']))
-            for match_id, old_score in old_match_scores.items()
-        ]
-        return np.array(
-            new_scores,
-            dtype=[(self.COL_MATCH_ID, np.object), (self.COL_SCORE, np.float64)],
-        )
+    def score(self, old_match_scores, query_meta, match_meta):
+        new_scores = [-abs(m['length'] - query_meta['length']) for m in match_meta]
+        return new_scores

 def create_document_to_score():
In the test, https://github.com/jina-ai/jina/blob/f0b6a44045f7fccca05a34020fd42981ce34dc4e/tests/unit/drivers/rank/test_matches2doc_rank_drivers.py#L61
comment created time in 4 hours
Pull request review comment jina-ai/jina
refactor: refactor rankers, move logic to driver
 def __init__(self, *args, **kwargs):
             **kwargs
         )

-    def score(self, query_meta, old_match_scores, match_meta):
-        new_scores = [
-            (match_id, -abs(match_meta[match_id]['length'] - query_meta['length']))
-            for match_id, old_score in old_match_scores.items()
-        ]
-        return np.array(
-            new_scores,
-            dtype=[(self.COL_MATCH_ID, np.object), (self.COL_SCORE, np.float64)],
-        )
+    def score(self, old_match_scores, query_meta, match_meta):
+        new_scores = [-abs(m['length'] - query_meta['length']) for m in match_meta]
+        return new_scores

 def create_document_to_score():
where is this code?
comment created time in 4 hours
Pull request review comment jina-ai/jina
refactor: refactor rankers, move logic to driver
 def __init__(self, *args, **kwargs):
             **kwargs
         )

-    def score(self, query_meta, old_match_scores, match_meta):
-        new_scores = [
-            (match_id, -abs(match_meta[match_id]['length'] - query_meta['length']))
-            for match_id, old_score in old_match_scores.items()
-        ]
-        return np.array(
-            new_scores,
-            dtype=[(self.COL_MATCH_ID, np.object), (self.COL_SCORE, np.float64)],
-        )
+    def score(self, old_match_scores, query_meta, match_meta):
+        new_scores = [-abs(m['length'] - query_meta['length']) for m in match_meta]
+        return new_scores

 def create_document_to_score():
1. The *20 responds to changes that have been applied to Document ID and continuous refactoring.
2. I think it is by test design, but this test is too complicated, to be honest 😂
comment created time in 4 hours
Pull request review comment jina-ai/jina
refactor: refactor rankers, move logic to driver
 def __init__(self, *args, **kwargs):
             **kwargs
         )

-    def score(self, query_meta, old_match_scores, match_meta):
-        new_scores = [
-            (match_id, -abs(match_meta[match_id]['length'] - query_meta['length']))
-            for match_id, old_score in old_match_scores.items()
-        ]
-        return np.array(
-            new_scores,
-            dtype=[(self.COL_MATCH_ID, np.object), (self.COL_SCORE, np.float64)],
-        )
+    def score(self, old_match_scores, query_meta, match_meta):
+        new_scores = [-abs(m['length'] - query_meta['length']) for m in match_meta]
+        return new_scores

 def create_document_to_score():
match.score.value = match_score
This line doesn't seem to work?
comment created time in 4 hours
pull request comment jina-ai/jina
refactor: refactor rankers, move logic to driver
What's the motivation for moving methods into drivers?
To have a much simpler interface. Every executor has a very neat and clear interface except Chunk2DocRanker.
Also, this way we can hide some of the grouping complexity from the executor developer.
comment created time in 5 hours
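One way to picture this split, as a sketch under stated assumptions rather than Jina's actual classes: the ranker only maps parallel lists to new scores, while a driver-like function handles extraction and reordering. WeightRanker, apply_ranker, and the dictionary-based documents are all hypothetical.

```python
class WeightRanker:
    """Hypothetical executor: knows nothing about documents, ids, or sorting."""

    match_keys = ("weight",)

    def score(self, old_match_scores, query_meta, match_meta):
        # One new score per match, in the same order as the inputs.
        return [old * meta["weight"] for old, meta in zip(old_match_scores, match_meta)]


def apply_ranker(doc, ranker):
    """Hypothetical driver step: gather inputs, call the ranker, reorder matches in place."""
    matches = doc["matches"]
    if not matches:
        return  # nothing to sort
    old_scores = [m["score"] for m in matches]
    match_meta = [{k: m[k] for k in ranker.match_keys} for m in matches]
    new_scores = ranker.score(old_scores, doc.get("meta"), match_meta)
    for match, new in zip(matches, new_scores):
        match["score"] = new
    matches.sort(key=lambda m: m["score"], reverse=True)


# Made-up document with two matches.
doc = {
    "meta": None,
    "matches": [
        {"id": "a", "score": 1.0, "weight": 0.5},
        {"id": "b", "score": 0.8, "weight": 2.0},
    ],
}
apply_ranker(doc, WeightRanker())
```

With this shape, someone writing a ranker only implements score; the bookkeeping around ids and ordering stays in one place.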
Pull request review comment jina-ai/jina
refactor: refactor rankers, move logic to driver
 def _apply_all(self, docs: 'DocumentSet', *args, **kwargs) -> None:
                 )

             matches = doc.matches
-            old_match_scores = {match.id: match.score.value for match in matches}
-            match_meta = (
-                {match.id: match.get_attrs(*self._exec_match_keys) for match in matches}
-                if self._exec_match_keys
-                else None
-            )
+            num_matches = len(matches)
+            old_match_scores = []
+            needs_match_meta = self._exec_match_keys is not None
+            match_meta = [] if needs_match_meta else None
+            for match in matches:
+                old_match_scores.append(match.score.value)
+                if needs_match_meta:
+                    match_meta.append(match.get_attrs(*self._exec_match_keys))
             # if there are no matches, no need to sort them
             if not old_match_scores:
                 continue

-            new_match_scores = self.exec_fn(query_meta, old_match_scores, match_meta)
-            self._sort_matches_in_place(doc, new_match_scores)
+            new_scores = self.exec_fn(old_match_scores, query_meta, match_meta)
yes, we could call it back to nee
comment created time in 5 hours
Pull request review comment jina-ai/jina
refactor: refactor rankers, move logic to driver
 def __init__(self, *args, **kwargs):
             **kwargs
         )

-    def score(self, query_meta, old_match_scores, match_meta):
-        new_scores = [
-            (match_id, -abs(match_meta[match_id]['length'] - query_meta['length']))
-            for match_id, old_score in old_match_scores.items()
-        ]
-        return np.array(
-            new_scores,
-            dtype=[(self.COL_MATCH_ID, np.object), (self.COL_SCORE, np.float64)],
-        )
+    def score(self, old_match_scores, query_meta, match_meta):
+        new_scores = [-abs(m['length'] - query_meta['length']) for m in match_meta]
+        return new_scores

 def create_document_to_score():
1. The *20 responds to changes that have been applied to Document ID and continuous refactoring.
2. I think it is by test design, but this test is too complicated, to be honest
comment created time in 5 hours