
stephan-hof/pyrocksdb 142

Python bindings for RocksDB

stephan-hof/boost_queue 6

C++ Queue template using boost locking. Python wrapper also included

stephan-hof/collectd-beanstalkd-py 5

Gathers statistics from beanstalkd for collectd using beanstalkc

axiros/mod_dechunk 4

Apache module which reads all incoming data into memory to replace 'Transfer-Encoding: chunked' with a Content-Length

stephan-hof/mysql-proxy-in-memory-lru-cache 4

Cache queries directly in mysql-proxy

stephan-hof/playground_fs_fuse 2

In memory filesystem written in python.

stephan-hof/beanstalkd 1

Beanstalk is a simple, fast work queue.

stephan-hof/collectd-mongodb 1

A Collectd plugin to monitor MongoDB (Collectd 4.x)

stephan-hof/collective.recipe.modwsgi 1

zc.buildout recipe for mod_wsgi based deployments

stephan-hof/gevent_aio_linux 1

Experimental python module to hook io_submit and friends into gevent for async file io

started jermp/pthash

started time in 19 days

issue comment urllib3/urllib3

PoolManager is not thread-safe

@bennylut I would say yes, since you delay the close and let the garbage collector trigger it, this time via __del__ rather than a weakref.

For completeness: this class alone is not enough. The following change is also needed to make it thread-safe.

-        self.pools = RecentlyUsedContainer(num_pools, dispose_func=dispose_func)
+        self.pools = RecentlyUsedContainer(num_pools)
reversefold

comment created time in a month

issue comment urllib3/urllib3

PoolManager is not thread-safe

Does THIS issue represent a valid issue with thread-safety in urllib3 or does it not?

I had the same question and investigated it by reading the code. I simplified the problem to the following:

pool_manager = urllib3.PoolManager(num_pools=2)
 
# Thread-1 gets a pool for host 'x'
pool = pool_manager.connection_from_pool_key('x', req_ctx)
# <Thread-1 gets scheduled away>

# Thread-2 gets a pool for host 'y'
pool = pool_manager.connection_from_pool_key('y', req_ctx)
# <Thread-2 gets scheduled away>

# Thread-3 gets a pool for host 'a'
pool = pool_manager.connection_from_pool_key('a', req_ctx)
# <Thread-3 gets scheduled away>

# At this stage the PoolManager has received a request for a third pool (host 'a').
# However, only two are allowed -> the pool for host 'x' gets evicted (oldest item in the LRU cache).
# Eviction closes the pool.

# So when Thread-1 continues with code like this
pool.urlopen()
# an exception is raised, because the pool was closed by PoolManager on eviction.

I basically repeat what @reversefold already said (thanks), just in different words.

Even more simplified: if many threads open connections to more hosts than num_pools allows, PoolManager is not thread-safe.
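
The interleaving above can be replayed deterministically without real threads. The following is only a minimal sketch: ToyPool and ToyPoolManager are hypothetical stand-ins written for this illustration, not urllib3's classes, and they assume nothing more than an LRU container that closes a pool at the moment it evicts it.

```python
from collections import OrderedDict


class ToyPool:
    """Hypothetical stand-in for a connection pool; not urllib3's class."""

    def __init__(self, host):
        self.host = host
        self.closed = False

    def close(self):
        self.closed = True

    def urlopen(self):
        if self.closed:
            raise RuntimeError(f"pool for {self.host!r} is closed")
        return f"response from {self.host}"


class ToyPoolManager:
    """Hypothetical LRU of pools that closes a pool as soon as it is evicted."""

    def __init__(self, num_pools):
        self.num_pools = num_pools
        self.pools = OrderedDict()

    def connection_from_pool_key(self, key):
        if key in self.pools:
            self.pools.move_to_end(key)  # mark as most recently used
            return self.pools[key]
        pool = self.pools[key] = ToyPool(key)
        if len(self.pools) > self.num_pools:
            _, evicted = self.pools.popitem(last=False)  # drop the oldest entry
            evicted.close()  # eager close, even if a caller still holds it
        return pool


manager = ToyPoolManager(num_pools=2)
pool_x = manager.connection_from_pool_key("x")  # "Thread-1"
pool_y = manager.connection_from_pool_key("y")  # "Thread-2"
pool_a = manager.connection_from_pool_key("a")  # "Thread-3", evicts "x"

try:
    pool_x.urlopen()  # "Thread-1" resumes with its stale reference
    outcome = "ok"
except RuntimeError as exc:
    outcome = str(exc)
print(outcome)
```

Each "thread" here is just a separate variable, which is enough to show that the failure comes from the order of operations and not from any low-level data race.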

As @reversefold already correctly analysed (really, you did all the work), the issue is the premature eviction and close of a connection pool by PoolManager.

I experimented (successfully) with the following patch: https://gist.github.com/stephan-hof/c87aefa776779e2bc1dccec649d0d663

The idea is to use a weak reference to handle closing the connections in the pool. This means the PoolManager does not need to call close on eviction, because the weakref eventually triggers the close once no thread holds a reference to the pool anymore.
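
The weak-reference idea can be sketched with the standard library's weakref.finalize. This is not the patch from the gist; ToyConnection, ToyPool, ToyPoolManager and _close_all are hypothetical names made up for this illustration. The assumption is that the cleanup callback receives the list of connections rather than the pool itself, so the callback does not keep the pool alive.

```python
import gc
import weakref
from collections import OrderedDict


class ToyConnection:
    """Hypothetical stand-in for a socket-backed connection."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def _close_all(connections):
    for conn in connections:
        conn.close()


class ToyPool:
    """Hypothetical pool whose connections are closed once it is unreachable."""

    def __init__(self, host, size=2):
        self.host = host
        self.connections = [ToyConnection() for _ in range(size)]
        # Pass the list, not self, so the finalizer holds no reference to the pool.
        weakref.finalize(self, _close_all, self.connections)


class ToyPoolManager:
    """Hypothetical LRU of pools that evicts without closing."""

    def __init__(self, num_pools):
        self.num_pools = num_pools
        self.pools = OrderedDict()

    def connection_from_pool_key(self, key):
        if key in self.pools:
            self.pools.move_to_end(key)
            return self.pools[key]
        pool = self.pools[key] = ToyPool(key)
        if len(self.pools) > self.num_pools:
            self.pools.popitem(last=False)  # evict the oldest pool, do not close it
        return pool


manager = ToyPoolManager(num_pools=2)
pool_x = manager.connection_from_pool_key("x")
conns_x = pool_x.connections  # kept only so the connections can be inspected
manager.connection_from_pool_key("y")
manager.connection_from_pool_key("a")  # evicts "x", but pool_x is still referenced

closed_while_in_use = [c.closed for c in conns_x]
del pool_x    # the last reference to the pool goes away
gc.collect()  # not required on CPython; helps on other interpreters
closed_after_release = [c.closed for c in conns_x]
print(closed_while_in_use, closed_after_release)
```

On CPython the finalizer runs as soon as the last reference is dropped; on interpreters without reference counting, the close may be delayed until a garbage collection pass.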

I'm fully aware that this diff is not ready for commit and needs polishing, but it illustrates the main idea. If the maintainers of urllib3 show interest in that approach I'm happy to create a proper pull-request.

reversefold

comment created time in a month
