If you are wondering where the data of this site comes from, please visit https://api.github.com/users/acj/events. GitMemory does not store any data, but only uses NGINX to cache data for a period of time. The idea behind GitMemory is simply to give users a better reading experience.
Adam Jensen acj USA https://acj.sh Pusher of buttons. Usually confused about something.

acj/CastCommander 5

A companion app for OpenCast that controls Google Cast devices

acj/ex_progress 2

A library for tracking progress across many tasks and sub-tasks

acj/FishEyeRemover 2

A simple algorithm (and sample iOS project) for removing the fisheye effect from photos

acj/AirMac 1

server on mac to receive airplay streams from ios devices - original code repo from http://code.google.com/p/airmac

acj/apidata 1

Repository for collaborative editing of API data

acj/debgraph 1

Tool for querying constellations of Debian packages

acj/devcon2012 1

Slides and code for my talk at TSC DevCon 2012

acj/dirmirror 1

A simple tool for mirroring a directory structure (sans files)

acj/go-guardian-pics 1

Go fetcher and parser for the 24 Hours in Pictures feed from the Guardian (#golang)

release dtolnay/cargo-tally

0.3.3

released time in 4 hours

release dtolnay/cargo-tally

0.3.2

released time in 4 hours

started ortuman/jackal

started time in 10 hours

release dtolnay/erased-serde

0.3.16

released time in 11 hours

fork graydon/actions-k3s

Github action for spinning up local k3s instance and running kubectl commands

fork in a day

started shanedrabing/polyfoto

started time in 2 days

started ZHKKKe/MODNet

started time in 3 days

created repository graydon/ordbog

lossy dictionary codes for accelerated scans

created time in 4 days

Pull request review comment rbspy/rbspy

Add ability to collect traces only when the CPU is active (--on-cpu)

 impl StackTraceGetter {
         Ok(self.get_trace_from_current_thread()?)
     }
-    fn get_trace_from_current_thread(&self) -> Result<StackTrace, MemoryCopyError> {
+    #[cfg(any(target_os = "macos", target_os = "windows", target_os = "linux"))]
+    fn is_on_cpu_os_specific(&self) -> Result<bool> {
+        // remoteprocess crate exposes a Thread.active() method for each of these targets
+        for thread in self.process.threads()?.iter() {

@acj this is just not accurate enough - imagine this case: 1 thread in the application is running some non-related C code doing whatever (and it's active, so thread.active() returns true). Meanwhile, another thread is running Ruby code (so it has rb_thread_status_THREAD_RUNNABLE, I guess?). But the OS decided that it's time that the Ruby thread relinquishes CPU, and it gets preempted. Its rb_thread_status remains as it was.

So, per the 2 checks:

  • the "process" is active, because one of its threads is active.
  • the "Ruby thread" is also "active" because its status is THREAD_RUNNABLE.

And the result is that we collect Ruby stacks from a thread that's not currently running!
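The gap between the two checks can be shown with a small model. This is only a sketch: `ThreadSnapshot`, `any_thread_on_cpu` and `sampled_thread_on_cpu` are hypothetical names made up for illustration, not rbspy or remoteprocess APIs.

```rust
/// Hypothetical snapshot of one OS thread (illustration only).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ThreadSnapshot {
    pub tid: u32,
    /// Whether the OS reports this thread as currently on a CPU.
    pub os_active: bool,
}

/// Process-wide heuristic: true if any thread in the process is on CPU.
pub fn any_thread_on_cpu(threads: &[ThreadSnapshot]) -> bool {
    threads.iter().any(|t| t.os_active)
}

/// Per-thread check: true only if the thread being sampled is on CPU.
pub fn sampled_thread_on_cpu(threads: &[ThreadSnapshot], sampled_tid: u32) -> bool {
    threads.iter().any(|t| t.tid == sampled_tid && t.os_active)
}
```

With one active thread running C code and a preempted Ruby thread, the process-wide heuristic says "on CPU" while the per-thread check for the Ruby thread says it is not.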

About the cost of thread.active() - it's not very expensive. It's a single read from /proc/tid/stat, so 3 syscalls (open, read, close). get_thread_status, by comparison, is only one process_vm_readv IIRC, so faster; but it should be benchmarked against reading /proc/tid/stat if we want a real comparison.
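For reference, a minimal sketch of pulling the state field out of the contents of a Linux stat file. It assumes the usual proc(5) layout (pid, then the command name in parentheses, then a one-character state); `parse_stat_state` and `is_running` are hypothetical helpers, and how remoteprocess actually implements `active()` is not shown here.

```rust
/// Extracts the one-character state field from the contents of a
/// Linux per-thread `stat` file (layout as described in proc(5)).
pub fn parse_stat_state(stat: &str) -> Option<char> {
    // The second field (comm) is wrapped in parentheses and may itself
    // contain spaces or ')' characters, so split at the last ')'.
    let comm_end = stat.rfind(')')?;
    let after_comm = &stat[comm_end + 1..];
    after_comm.split_whitespace().next()?.chars().next()
}

/// 'R' is the state the kernel reports for a running (or runnable) task.
pub fn is_running(stat: &str) -> bool {
    parse_stat_state(stat) == Some('R')
}
```

The parsing itself is cheap; the cost discussed above comes from the open/read/close round trip.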

michelhe

comment created time in 4 days

Pull request review comment rbspy/rbspy

Add ability to collect traces only when the CPU is active (--on-cpu)

 macro_rules! get_stack_trace(
             ruby_global_symbols_address_location: Option<usize>,
             source: &T,
             pid: Pid,
-        ) -> Result<StackTrace, MemoryCopyError> {
+            on_cpu: bool,
+        ) -> Result<Option<StackTrace>, MemoryCopyError> {
             let thread: $thread_type = get_execution_context(ruby_current_thread_address_location, ruby_vm_address_location, source)
                 .context(ruby_current_thread_address_location)?;
+            // testing the thread state in the interpreter.
+            if on_cpu && get_thread_status(&thread, source)? != 0 /* THREAD_RUNNABLE */ {

Thanks for the brief Ruby internals :)

> I'm not opposed to introducing a --gvl option if we can find a reliable way to detect which thread holds the lock,

I might have missed that - isn't it what we expect THREAD_RUNNABLE to tell? Or does runnable mean "can be run given that it obtains the GVL"? If the latter, then I think that looking at it doesn't make any sense at all.

> We might also want an --off-cpu option. Lots of future paths to explore here.

Yup. Well, basically, no --on-cpu means --off-cpu now :)

> It seems that this get_thread_status check is the key mechanism, and the earlier check using remoteprocess's Thread::active is a fast/cheap filter to skip get_trace if we know that none of the process's threads is active (from the OS's perspective). Is that accurate?

To my understanding - not precisely, I'll give an elaborated answer on the other discussion :sweat_smile:

michelhe

comment created time in 4 days

started 0xd34df00d/refinedt

started time in 5 days

Pull request review comment rbspy/rbspy

Add ability to collect traces only when the CPU is active (--on-cpu)

 macro_rules! get_stack_trace(
             ruby_global_symbols_address_location: Option<usize>,
             source: &T,
             pid: Pid,
-        ) -> Result<StackTrace, MemoryCopyError> {
+            on_cpu: bool,
+        ) -> Result<Option<StackTrace>, MemoryCopyError> {
             let thread: $thread_type = get_execution_context(ruby_current_thread_address_location, ruby_vm_address_location, source)
                 .context(ruby_current_thread_address_location)?;
+            // testing the thread state in the interpreter.
+            if on_cpu && get_thread_status(&thread, source)? != 0 /* THREAD_RUNNABLE */ {

I don't know ruby well enough - is this the equivalent of CPython's GIL? If so, it's probably better to add this as another option (compare to py-spy's --active for on-CPU, and --gil for GIL; 2 different options that can go together, but I might want --active only w/o --gil)

michelhe

comment created time in 6 days

Pull request review comment rbspy/rbspy

Add ability to collect traces only when the CPU is active (--on-cpu)

 impl StackTraceGetter {
         Ok(self.get_trace_from_current_thread()?)
     }
-    fn get_trace_from_current_thread(&self) -> Result<StackTrace, MemoryCopyError> {
+    #[cfg(any(target_os = "macos", target_os = "windows", target_os = "linux"))]
+    fn is_on_cpu_os_specific(&self) -> Result<bool> {
+        // remoteprocess crate exposes a Thread.active() method for each of these targets
+        for thread in self.process.threads()?.iter() {

If the "current" ruby thread is not active OS-wise, it's surely not active ruby-wise (to be precise: it may be marked active ruby-wise, but that just means it's preempted). Another thread may be running, doing non-ruby stuff, while the thread doing ruby stuff is preempted, and you will incorrectly account that as ruby code.

michelhe

comment created time in 6 days

Pull request review comment rbspy/rbspy

Add ability to collect traces only when the CPU is active (--on-cpu)

 impl StackTraceGetter {
         Ok(self.get_trace_from_current_thread()?)
     }
-    fn get_trace_from_current_thread(&self) -> Result<StackTrace, MemoryCopyError> {
+    #[cfg(any(target_os = "macos", target_os = "windows", target_os = "linux"))]
+    fn is_on_cpu_os_specific(&self) -> Result<bool> {
+        // remoteprocess crate exposes a Thread.active() method for each of these targets
+        for thread in self.process.threads()?.iter() {

We don't know which OS thread is the "current" ruby thread until we start examining the interpreter's memory, so the heuristic is: "if any thread in the thread group is on CPU".

michelhe

comment created time in 6 days

Pull request review comment rbspy/rbspy

Add ability to collect traces only when the CPU is active (--on-cpu)

 impl StackTraceGetter {
         Ok(self.get_trace_from_current_thread()?)
     }
-    fn get_trace_from_current_thread(&self) -> Result<StackTrace, MemoryCopyError> {
+    #[cfg(any(target_os = "macos", target_os = "windows", target_os = "linux"))]
+    fn is_on_cpu_os_specific(&self) -> Result<bool> {
+        // remoteprocess crate exposes a Thread.active() method for each of these targets
+        for thread in self.process.threads()?.iter() {
+            if thread.active()? {
+                return Ok(true);
+            }
+        }
+        Ok(false)
+    }
+
+    #[cfg(not(any(target_os = "macos", target_os = "windows", target_os = "linux")))]
+    fn is_on_cpu_os_specific(&self) -> Result<bool> {
+        /* We don't have OS specific checks for these targets,
+         * so fallback to using the interpreter based check down the line */
+        Ok(false)

Nice catch, meant to return true 😅

michelhe

comment created time in 6 days

Pull request review comment rbspy/rbspy

Add ability to collect traces only when the CPU is active (--on-cpu)

 impl StackTraceGetter {
         Ok(self.get_trace_from_current_thread()?)
     }
-    fn get_trace_from_current_thread(&self) -> Result<StackTrace, MemoryCopyError> {
+    #[cfg(any(target_os = "macos", target_os = "windows", target_os = "linux"))]
+    fn is_on_cpu_os_specific(&self) -> Result<bool> {
+        // remoteprocess crate exposes a Thread.active() method for each of these targets
+        for thread in self.process.threads()?.iter() {

This is not the logic you want - you need to check the "current" thread that we're going to get stacktraces from, not just any thread in the process

michelhe

comment created time in 7 days

Pull request review comment rbspy/rbspy

Add ability to collect traces only when the CPU is active (--on-cpu)

 macro_rules! get_stack_trace(
             ruby_global_symbols_address_location: Option<usize>,
             source: &T,
             pid: Pid,
-        ) -> Result<StackTrace, MemoryCopyError> {
+            on_cpu: bool,
+        ) -> Result<Option<StackTrace>, MemoryCopyError> {
             let thread: $thread_type = get_execution_context(ruby_current_thread_address_location, ruby_vm_address_location, source)
                 .context(ruby_current_thread_address_location)?;
+            let thread_status = get_thread_status(&thread, source)?;

Tiny optimization - these 2 calls require reading into the remote's memory, so you can avoid them unless actually needed (don't need the status if not on_cpu, and don't need thread_id if the on_cpu check doesn't pass)
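A rough sketch of that ordering, with closures standing in for the two remote-memory reads. `sample_thread_id` and its parameters are hypothetical; only the `0 /* THREAD_RUNNABLE */` comparison is taken from the diff under review.

```rust
/// Returns Some(thread_id) when a trace should be collected and None
/// when the sample is filtered out. `read_status` and `read_thread_id`
/// stand in for reads from the remote process's memory.
pub fn sample_thread_id<S, I>(on_cpu: bool, read_status: S, read_thread_id: I) -> Option<u64>
where
    S: FnOnce() -> u32,
    I: FnOnce() -> u64,
{
    // Value compared against in the diff: `!= 0 /* THREAD_RUNNABLE */`.
    const THREAD_RUNNABLE: u32 = 0;

    // `&&` short-circuits, so the status is read only when on_cpu is set.
    if on_cpu && read_status() != THREAD_RUNNABLE {
        return None;
    }
    // The thread id is read only once the sample is known to be kept.
    Some(read_thread_id())
}
```

Without on_cpu the status read never happens, and a filtered sample never pays for the thread id read.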

michelhe

comment created time in 7 days

Pull request review comment rbspy/rbspy

Add ability to collect traces only when the CPU is active (--on-cpu)

 production**. `rbspy` lets you record profiling data, save the raw profiling data to disk, and then analyze it in a variety of different ways later on.
-## only wall-clock profiling
+## profiling options
 There are 2 main ways to profile code -- you can either profile everything the application does (including waiting), or only profile when the application is using the CPU.
-rbspy profiles everything the program does (including waiting) -- there's no
-option to just profile when the program is using the CPU.
+By defeault, rbspy profiles everything the program does (including waiting).
+There is an experimantal option to profile only when the program is using the CPU. (`--on-cpu`)
There is an experimental option to profile only when the program is using the CPU. (`--on-cpu`)
michelhe

comment created time in 7 days

Pull request review comment rbspy/rbspy

Add ability to collect traces only when the CPU is active (--on-cpu)

 impl StackTraceGetter {
         Ok(self.get_trace_from_current_thread()?)
     }
-    fn get_trace_from_current_thread(&self) -> Result<StackTrace, MemoryCopyError> {
+    #[cfg(any(target_os = "macos", target_os = "windows", target_os = "linux"))]
+    fn is_on_cpu_os_specific(&self) -> Result<bool> {
+        // remoteprocess crate exposes a Thread.active() method for each of these targets
+        for thread in self.process.threads()?.iter() {
+            if thread.active()? {
+                return Ok(true);
+            }
+        }
+        Ok(false)
+    }
+
+    #[cfg(not(any(target_os = "macos", target_os = "windows", target_os = "linux")))]
+    fn is_on_cpu_os_specific(&self) -> Result<bool> {
+        /* We don't have OS specific checks for these targets,
+         * so fallback to using the interpreter based check down the line */
+        Ok(false)

If you return false here and on_cpu is used, then nothing will be collected at all.

In this configuration - perhaps just deny using on_cpu?
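To make the effect concrete, here is a small model of both outcomes. `count_collected` and `validate_on_cpu_flag` are hypothetical helpers written for this sketch; they are not part of rbspy.

```rust
/// Counts how many samples survive filtering. Each entry in
/// `interpreter_runnable` is the interpreter-level verdict for one sample;
/// `os_on_cpu` is what the OS-level check returned.
pub fn count_collected(on_cpu: bool, os_on_cpu: bool, interpreter_runnable: &[bool]) -> usize {
    interpreter_runnable
        .iter()
        .filter(|&&runnable| !on_cpu || (os_on_cpu && runnable))
        .count()
}

/// Alternative to a fallback value: reject the flag up front on
/// platforms that have no OS-level check.
pub fn validate_on_cpu_flag(on_cpu: bool, os_check_supported: bool) -> Result<(), String> {
    if on_cpu && !os_check_supported {
        return Err("--on-cpu is not supported on this platform".to_string());
    }
    Ok(())
}
```

A fallback that always reports false drops every sample once on_cpu is set, while one that reports true leaves the decision to the interpreter-level check.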

michelhe

comment created time in 7 days

started awslabs/shuttle

started time in 7 days

issue comment rbspy/rbspy

Coloring flamegraphs?

Since rbspy recognizes c functions now, I think mixed mode would have the most benefit for different colors.

janpio

comment created time in 10 days

started apple/swift-driver

started time in 10 days

started google/highway

started time in 11 days

PR opened rbspy/rbspy

Add ability to collect traces only when the CPU is active (--on-cpu)

This change builds on what #127 tried to bring: measuring how much time is spent on CPU (implemented slightly differently and more up to date).

We filter traces that are off-cpu using 2 checks:

  • OS-specific, currently using the remoteprocess crate to determine if there are any active threads at this point in time.
  • Using the rb_thread_struct.status field to figure out if a thread is in THREAD_RUNNABLE or not when we come to get a trace from it. The problem is that this is only a heuristic and will be inaccurate in cases when the process gets preempted by the OS while it has runnable threads, so we rely on OS-specific checks as well when possible.
+220 -39

0 comment

6 changed files

pr created time in 11 days

started kean/NukeUI

started time in 12 days

started mchakravarty/CodeEditorView

started time in 12 days

started audulus/vger

started time in 14 days

started microsoft/Power-Fx

started time in 16 days