If you are wondering where this site's data comes from, please visit https://api.github.com/users/lopuhin/events. GitMemory does not store any data; it only uses NGINX to cache data for a period of time. The idea behind GitMemory is simply to give users a better reading experience.

lopuhin/kaggle-dstl 195

Kaggle DSTL Satellite Imagery Feature Detection

lopuhin/kaggle-dsbowl-2018-dataset-fixes 130

Kaggle Data Science Bowl 2018 dataset fixes

lopuhin/gram-matcher 4

Match lexico-grammar templates with natural language sentences, using pymorphy

lopuhin/kaggle-amazon-2017 4

Solution for the Kaggle challenge "Planet: Understanding the Amazon from Space"

lopuhin/clj-grantt 1

Planning work described by a Gantt diagram, according to given constraints

lopuhin/CodeMirror2 1

In-browser code editor

lopuhin/dbfpy 1

Porting to Python 3

lopuhin/dynide 1

Python Dynamic IDE - test and re-run your code without leaving your favorite editor

issue comment scrapinghub/splash

React pages are not rendered

Same here. React pages are not being rendered; the only ones that work are those that use React to render static content, such as Gatsby, Next, etc. Any solutions for this?

palle-k

comment created time in 2 days

issue opened scrapinghub/dateparser

odd behaviour with plus sign

Using now + <x> mins subtracts rather than adds, which is odd and counterintuitive:

In [17]: dateparser.parse('now', languages=['en']).ctime()
Out[17]: 'Sat May 15 13:07:06 2021'

In [18]: dateparser.parse('now + 10 mins', languages=['en']).ctime()
Out[18]: 'Sat May 15 12:57:13 2021'

However, using the word in after the + sign seems to produce the right result, though that also makes the now + part unnecessary.
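Until this is fixed, the offset can be applied by hand. The sketch below is a hypothetical stdlib-only workaround (the function name and the supported grammar are my own, not part of dateparser): it parses the "now ± <n> mins" suffix with a regular expression and applies the delta itself.

```python
import re
from datetime import datetime, timedelta

def parse_now_plus(text, now=None):
    """Parse strings like 'now + 10 mins' by applying the offset ourselves.

    Hypothetical workaround sketch; only handles minute offsets from 'now'.
    """
    now = now or datetime.now()
    m = re.fullmatch(r'now\s*([+-])\s*(\d+)\s*min(ute)?s?', text.strip())
    if not m:
        raise ValueError('unsupported expression: %r' % text)
    sign = 1 if m.group(1) == '+' else -1
    return now + sign * timedelta(minutes=int(m.group(2)))
```

With a fixed reference time, 'now + 10 mins' then lands 10 minutes in the future, as one would intuitively expect.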

created time in 3 days

pull request comment scrapy-plugins/scrapy-zyte-smartproxy

Use a custom logger instead of the root one

Codecov Report

Merging #101 (c631966) into master (2d58862) will not change coverage. The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##            master      #101   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            3         3           
  Lines          231       231           
=========================================
  Hits           231       231           
Impacted Files Coverage Δ
scrapy_zyte_smartproxy/middleware.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 2d58862...c631966. Read the comment docs.

elacuesta

comment created time in 4 days

PR opened scrapy-plugins/scrapy-zyte-smartproxy

Use a custom logger instead of the root one

Use a custom logger instead of the root one.

+17 -14

0 comment

2 changed files

pr created time in 4 days

issue opened scrapy/w3lib

w3lib.url.safe_url_string incorrectly encodes an IDNA domain with a port

Steps to reproduce:

>>> from w3lib.url import safe_url_string
>>> safe_url_string('http://新华网.中国')
'http://xn--xkrr14bows.xn--fiqs8s'
>>> safe_url_string('http://新华网.中国:80')
'http://xn--xkrr14bows.xn--:80-u68dy61b'

safe_url_string('http://新华网.中国:80') expected result:

'http://xn--xkrr14bows.xn--fiqs8s:80'

real result:

'http://xn--xkrr14bows.xn--:80-u68dy61b'

Related code: https://github.com/scrapy/w3lib/blob/ef5c11012a4d56151eee042bea06132481f318e1/w3lib/url.py#L80

netloc = parts.netloc.encode('idna')

Maybe IDNA encoding should be done on hostname rather than netloc.
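That suggestion can be sketched as follows. This is not w3lib's implementation, just a minimal illustration (the function name is hypothetical, and userinfo credentials in the netloc are ignored for brevity): IDNA-encode only the hostname, then reattach the port.

```python
from urllib.parse import urlsplit, urlunsplit

def idna_safe_netloc(url):
    # Sketch: encode just the hostname with the 'idna' codec instead of the
    # whole netloc, so a ':port' suffix is never fed into the IDNA encoder.
    parts = urlsplit(url)
    host = parts.hostname.encode('idna').decode('ascii')
    netloc = host if parts.port is None else '%s:%d' % (host, parts.port)
    return urlunsplit((parts.scheme, netloc, parts.path,
                       parts.query, parts.fragment))
```

Under this scheme, the port survives untouched while only the labels of the hostname are converted to punycode.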

created time in 5 days

issue comment scrapinghub/splash

Response can't be encoded the right way

Something was wrong in my settings. I was using the Accept-Encoding "gzip, deflate, br".
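If the Brotli ("br") encoding was the culprit, one hedged fix is to advertise only encodings the rendering side can decode. The fragment below is an assumption-laden sketch for a Scrapy project's settings.py, using Scrapy's real DEFAULT_REQUEST_HEADERS setting; whether dropping "br" resolves this particular issue is a guess based on the comment above.

```python
# settings.py — sketch: stop advertising Brotli so servers fall back to
# gzip/deflate, which are decoded reliably.
DEFAULT_REQUEST_HEADERS = {
    "Accept-Encoding": "gzip, deflate",
}
```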

triangle959

comment created time in 5 days

issue closed scrapinghub/splash

Response can't be encoded the right way

When I used Scrapy and Splash to crawl some websites, I found some wrongly encoded text (see attached image).

my version: Scrapy 2.5.0 scrapy-splash 0.7.2

closed time in 5 days

triangle959

pull request comment scrapinghub/dateparser

Consider RETURN_TIME_AS_PERIOD for timestamp times

Hi @onlynone! Thanks for opening this PR.

This definitely should work. Could you add a test that covers this change?

onlynone

comment created time in 5 days

issue opened scrapinghub/splash

Response can't be encoded the right way

When I used Scrapy and Splash to crawl some websites, I found some wrongly encoded text (see attached image).

my version: Scrapy 2.5.0 scrapy-splash 0.7.2

created time in 5 days

pull request comment scrapinghub/dateparser

Consider RETURN_TIME_AS_PERIOD for timestamp times

Codecov Report

Merging #922 (e424c1a) into master (255c421) will not change coverage. The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #922   +/-   ##
=======================================
  Coverage   98.26%   98.26%           
=======================================
  Files         231      231           
  Lines        2597     2597           
=======================================
  Hits         2552     2552           
  Misses         45       45           
Impacted Files Coverage Δ
dateparser/date.py 99.22% <ø> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 255c421...e424c1a. Read the comment docs.

onlynone

comment created time in 5 days

issue comment scrapy-plugins/scrapy-jsonrpc

import error

How do I uninstall this package?

Usman995

comment created time in 5 days

issue opened scrapy-plugins/scrapy-jsonrpc

import error

Since I installed this in my project I've been getting an import error (screenshot from 2021-05-12 at 23:08:33 attached).

created time in 5 days

PR opened scrapy/scrapy.org

Rsocks.net

Greetings!

We are a proxy sales partner for one of your competitors, with whom we have been working successfully for a long time.

We invite you to use our services on favorable partnership terms. You can test our proxies for free before starting the cooperation.

Our service has a feature-rich API to integrate with your resource, and our development team is ready to help you with the integration.

If you are ready to start integrating, or if you have any questions, please contact our technical support at rsocks.net; we'll be in touch!

+3552 -1388

0 comment

102 changed files

pr created time in 5 days

created tag scrapy-plugins/scrapy-zyte-smartproxy

tag v2.0.0

Crawlera middleware for Scrapy

created time in 6 days

push event scrapy-plugins/scrapy-zyte-smartproxy

geronsv

commit sha 2d5886295f0725fc4d2284b0d0efa6ea4a6f9040

Bump version: 1.7.2 → 2.0.0

view details

push time in 6 days

push event scrapy-plugins/scrapy-zyte-smartproxy

geronsv

commit sha c0c079a5ff9ff5467dd1678aee6bbf411c23461e

Added X-Crawlera-Client to release notes

view details

Sergey Geron

commit sha ee3007b3b85895bd5954e811ec89d4190db0b2f0

Update news.rst

view details

Sergey Geron

commit sha a1233d52c0331ed242d0b302465ef8b79cc0a6b7

Update news.rst

view details

Sergey Geron

commit sha ccd102e356321a5e7e1f95a769da19c1c3a8a2d9

Added X-Crawlera-Client to release notes (#100)

view details

push time in 6 days

Pull request review comment scrapy-plugins/scrapy-zyte-smartproxy

Added X-Crawlera-Client to release notes

 following backward-incompatible changes:
 -   The online documentation is moving to
     https://scrapy-zyte-smartproxy.readthedocs.io/
+-   New ``X-Crawlera-Client`` header support
+

Yes, looks good. Updated.

geronsv

comment created time in 6 days

Pull request review comment scrapy-plugins/scrapy-zyte-smartproxy

Added X-Crawlera-Client to release notes

 following backward-incompatible changes:
 -   The online documentation is moving to
     https://scrapy-zyte-smartproxy.readthedocs.io/
+-   New ``X-Crawlera-Client`` header support
+

Adding it to this list makes it look like this was part of the rebranding.

What about adding an independent paragraph in these lines after the note below?

In addition to that, the ``X-Crawlera-Client`` header is now automatically included in all requests.
geronsv

comment created time in 6 days

pull request comment scrapy-plugins/scrapy-zyte-smartproxy

Added X-Crawlera-Client to release notes

Codecov Report

Merging #100 (c0c079a) into master (4c1c214) will not change coverage. The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##            master      #100   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            3         3           
  Lines          231       230    -1     
=========================================
- Hits           231       230    -1     
Impacted Files Coverage Δ
scrapy_zyte_smartproxy/middleware.py 100.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 4c1c214...c0c079a. Read the comment docs.

geronsv

comment created time in 6 days

push event scrapy-plugins/scrapy-zyte-smartproxy

Adrián Chaves

commit sha 1cc3cc73ee457cae74061900db0820b0f81896b7

Crawlera → Zyte Smart Proxy Manager

view details

Adrián Chaves

commit sha 7fc1f2dcbc3c0a4cf8b1cc1790f157cec83239d7

Fix tests

view details

Adrián Chaves

commit sha f41db0a158d2f6078877a092a6f1b7dff6b38843

8010 → 8011

view details

Adrián Chaves

commit sha ca186da81d43328bd07a1e0fe6cb40a8649d2ead

Merge remote-tracking branch 'upstream/master' into rebranding

view details

Adrián Chaves

commit sha 6bb0bf9f835c9f48c3d75b7a0ad7750f7a2f24cc

Merge remote-tracking branch 'upstream/master' into rebranding

view details

Sergey Geron

commit sha 4c1c214c2c3a49f1bad36e52a31889362d86eb1a

Crawlera → Zyte Smart Proxy Manager (#97)

view details

push time in 6 days

PR merged scrapy-plugins/scrapy-zyte-smartproxy

Crawlera → Zyte Smart Proxy Manager

Closes #96 (strongly based on it).

+389 -335

3 comments

14 changed files

Gallaecio

pr closed time in 6 days

PR opened moscow-technologies/blockchain-voting

Bump hosted-git-info from 2.7.1 to 2.8.9 in /smart-contracts

Bumps hosted-git-info from 2.7.1 to 2.8.9.

Changelog (sourced from hosted-git-info's changelog):

2.8.9 (2021-04-07) — Bug fixes: backport regex fix from #76 (29adfe5), closes #84
2.8.8 (2020-02-29) — Bug fixes: #61 & #65, addressing issues with the url.URL implementation which regressed Node 6 support (5038b18), closes #66
2.8.7 (2020-02-26) — Bug fixes: do not attempt to use url.URL when unavailable (2d0bb66), closes #61 #62; do not pass scp-style URLs to the WhatWG url.URL (f2cdfcf), closes #60
2.8.6 (2020-02-25)
2.8.5 (2019-10-07) — Bug fixes: updated pathmatch for gitlab (e8325b5, ffe056f), closes #51
2.8.4 (2019-08-12)
... (truncated)

Commits: 8d4b369 chore(release): 2.8.9; 29adfe5 fix: backport regex fix from #76; afeaefd chore(release): 2.8.8; 5038b18 fix: #61 & #65 addressing issues w/ url.URL implementation which regressed Node 6 support; 7440afa chore(release): 2.8.7; 2d0bb66 fix: Do not attempt to use url.URL when unavailable; f2cdfcf fix: Do not pass scp-style URLs to the WhatWG url.URL; e1b83df chore(release): 2.8.6; ff259a6 Ensure passwords in hosted Git URLs are correctly escaped; 624fd6f chore(release): 2.8.5; additional commits viewable in the compare view.

Maintainer changes: this version was pushed to npm by nlf, a new releaser for hosted-git-info since your current version.

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • @dependabot use these labels will set the current labels as the default for future PRs for this repo and language
  • @dependabot use these reviewers will set the current reviewers as the default for future PRs for this repo and language
  • @dependabot use these assignees will set the current assignees as the default for future PRs for this repo and language
  • @dependabot use this milestone will set the current milestone as the default for future PRs for this repo and language

You can disable automated security fix PRs for this repo from the Security Alerts page.


+3 -3

0 comment

1 changed file

pr created time in 7 days

issue comment scrapy-plugins/scrapy-crawlera-fetch

Log target url when an exception occurs in the spider, in order to facilitate debugging

Interesting. I had seen this before, but I thought it would no longer be a problem after https://github.com/scrapy/scrapy/pull/4632; it seems I was wrong. Thanks for the report, I'll look into it. In the meantime, could you try using the provided log formatter?

kalessin

comment created time in 8 days

issue opened scrapy-plugins/scrapy-crawlera-fetch

Log target url when an exception occurs in the spider, in order to facilitate debugging

Example:

https://app.zyte.com/p/515885/7/13/log?line=173

Right now it is not possible to know which URL triggered the failure in the spider logic.

created time in 8 days

delete branch moscow-technologies/blockchain-voting

delete branch : dependabot/npm_and_yarn/smart-contracts/lodash-4.17.19

delete time in 8 days