Design
This page describes the problem and the solution in general, including what preceded Procpath and why it didn’t solve the problem.
Problem statement
On servers and desktops, processes became treelike long ago. For instance, this is the process tree of a Chromium browser with a few open tabs:
chromium-browser ...
├─ chromium-browser --type=utility ...
├─ chromium-browser --type=gpu-process ...
│  └─ chromium-browser --type=broker
└─ chromium-browser --type=zygote
   └─ chromium-browser --type=zygote
      ├─ chromium-browser --type=renderer ...
      ├─ chromium-browser --type=renderer ...
      ├─ chromium-browser --type=renderer ...
      ├─ chromium-browser --type=renderer ...
      └─ chromium-browser --type=utility ...
In a server environment it can be substituted with a dozen task queue worker process trees, the processes of a database connection pool, several web server process trees, or anything goes in a bunch of Docker containers.
This environment raises some operational questions, point-in-time and temporal. When I have several trees like the one above, how do I know the (sub)tree’s current resource profile, like total main memory consumption, CPU time and so on? How do I track these profiles over time when, for instance, I suspect a memory leak? How do I point other process analysis and introspection tools at these trees?
Existing approaches for outputting a tree’s PIDs include applying bash-fu to pstree output [1] or nesting pgrep for shallower cases. procps (providing top and ps) is inadequate for any of the above, from embracing the process hierarchy to collecting temporal metrics. psmisc (providing pstree) is only good for displaying the hierarchy and doesn’t cover any programmatic interaction. htop is great for interactive inspection of process trees with its filter and search, but is also useless for programmatic interaction. glances has a JSON output feature, but it doesn’t have process-level granularity.
For process metrics collection alone (given you know the PIDs), sysstat (providing pidstat) is likely the only simple solution, and it still requires some ad-hoc scripting [2].
Solution
The solution lies in applying the right-tool-for-the-job principle.
1. Represent Procfs [3] processes as a forest structure (a disjoint union of trees).
2. Expose this structure to queries in a compact tree query language.
3. Flatten and store a query result in a ubiquitous tabular format allowing for easy sharing and transformation.
A major non-functional requirement here is ease of installation, preferably in the form of a pure-Python package. That’s because an ad-hoc investigation may not allow installing a compiler toolchain on the target machine, which rules out psutil [4] and discourages XML as the tree representation format (as it would require lxml for XPath).
Representation is relatively simple. Read all /proc/N/stat files, build the forest and serialise it as JSON. The ubiquitous tabular form is even simpler: SQLite!
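The representation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Procpath's actual code: the function names `parse_stat` and `build_forest` are hypothetical, the sample lines are made up and truncated to the first four fields of a stat line (pid, comm, state, ppid), and on a real system the lines would be read from the /proc/N/stat files instead.

```python
import json


def parse_stat(line):
    """Parse the first four fields of a /proc/N/stat line into a dict."""
    # comm is wrapped in parentheses and may itself contain spaces or
    # parentheses, so split around the first "(" and the last ")".
    head, _, rest = line.partition('(')
    comm, _, tail = rest.rpartition(')')
    fields = tail.split()
    return {
        'pid': int(head),
        'comm': comm,
        'state': fields[0],
        'ppid': int(fields[1]),
    }


def build_forest(stat_lines):
    """Link processes by ppid and return the list of root nodes."""
    nodes = {}
    for line in stat_lines:
        stat = parse_stat(line)
        nodes[stat['pid']] = {'stat': stat}

    roots = []
    for node in nodes.values():
        parent = nodes.get(node['stat']['ppid'])
        if parent is None:
            # ppid is 0, or the parent is missing from this snapshot
            roots.append(node)
        else:
            parent.setdefault('children', []).append(node)
    return roots


# Hypothetical sample input, truncated to the first four stat fields
sample = [
    '1 (systemd) S 0',
    '2 (kthreadd) S 0',
    '323 (Web Content) S 1',
    '324 (bash) S 323',
]
forest = build_forest(sample)
print(json.dumps(forest, indent=2))
```

Nodes whose parent is not in the snapshot are treated as roots here, which is one possible way to keep the result a forest when a process exits between reads.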
The step in between is much less obvious. Discarding special graph query languages and focusing on the ones targeting JSON, the list goes like this. Unfortunately, taking the Python implementations into account, it is not about choosing the best match for the requirements, but about choosing the lesser evil.
1. JSONPath [5] and its Python port. An informal, regex-based (obscure error messages and edge cases), what-if-XPath-worked-on-JSON prototype. The most popular non-regex Python implementations are a sequence of forks, none of which supports recursive descent. One grammar-based package would work [6], but its filter expressions are just Python eval.
2. JSON Pointer [7]. No recursive descent supported.
3. JMESPath (an AWS boto dependency). No recursive descent supported [8].
4. jq and its Python bindings [9]. jq is a programming language in the disguise of a JSON transformation CLI tool. Even though there’s lengthy documentation, on occasional use jq feels very counter-intuitive and requires a lot of googling and trial-and-error.
After pondering and playing with these, item 1 with JSONPyth [6] was the choice. The Python expression syntax of filters can be “jsonified” with the AttrDict idiom, and the security concern of eval is acceptable for the CLI use cases (and in some cases being able to write an arbitrary Python expression in a filter can actually be useful).
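To illustrate the AttrDict idiom mentioned above, here is a minimal sketch. It is not JSONPyth's or Procpath's implementation: the `AttrDict` class, the `matches` helper and the variable name `node` used inside the expression are assumptions made for this example only.

```python
class AttrDict(dict):
    """A dict whose keys can also be read as attributes."""

    def __getattr__(self, name):
        try:
            value = self[name]
        except KeyError:
            raise AttributeError(name) from None
        # Wrap nested dicts so that chained access like a.b.c works
        return AttrDict(value) if isinstance(value, dict) else value


def matches(node, expression):
    """Evaluate a filter expression with the node bound to the name ``node``.

    eval() runs arbitrary Python, so this is only acceptable when the
    expression comes from a trusted source, such as the user's own CLI input.
    """
    return bool(eval(expression, {}, {'node': AttrDict(node)}))


# Hypothetical process node
node = {
    'stat': {'pid': 323, 'ppid': 1, 'rss': 2048},
    'cmdline': 'chromium-browser --type=renderer',
}
print(matches(node, 'node.stat.rss > 1000'))
print(matches(node, '"firefox" in node.cmdline'))
```

The attribute access keeps the filter close to JSON-style dotted paths, while the second call shows the flip side of eval: any Python expression is accepted.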
Data model
procpath query outputs the root process nodes with all their descendants to stdout.
[
  {
    "stat": {"pid": 1, "ppid": 0, ...},
    "cmdline": "a root node",
    "other_stat_file": ...,
    "children": [
      {
        "cmdline": "cmdline of some process",
        "stat": {"pid": 323, "ppid": 1, ...},
        "other_stat_file": ...
      },
      {
        "cmdline": "cmdline of another process with children",
        "stat": {"pid": 324, "ppid": 1, ...},
        "other_stat_file": ...,
        "children": [...]
      },
      ...
    ]
  },
  {
    "stat": {"pid": 2, "ppid": 0, ...},
    "cmdline": "another root node",
    "other_stat_file": ...,
    "children": [...]
  },
  ...
]
When a JSONPath query is provided to the command, the output contains only the nodes (or their parts, depending on the query) matching the query (i.e. the top-level elements of the list are the matching nodes).
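The effect of a recursive-descent query on this structure can be approximated in plain Python. The sketch below is a stand-in for what such a query selects, not JSONPath itself and not Procpath's code; `descend`, `select` and the sample forest are hypothetical.

```python
def descend(nodes):
    """Yield every node in the forest, parents before children."""
    for node in nodes:
        yield node
        yield from descend(node.get('children', []))


def select(forest, predicate):
    """Return matching nodes as a flat list; each keeps its own subtree."""
    return [node for node in descend(forest) if predicate(node)]


# Hypothetical forest in the shape shown above
forest = [
    {
        'stat': {'pid': 1, 'ppid': 0},
        'cmdline': 'a root node',
        'children': [
            {'stat': {'pid': 323, 'ppid': 1}, 'cmdline': 'worker'},
            {
                'stat': {'pid': 324, 'ppid': 1},
                'cmdline': 'supervisor',
                'children': [
                    {'stat': {'pid': 400, 'ppid': 324}, 'cmdline': 'worker'},
                ],
            },
        ],
    },
]
result = select(forest, lambda n: n['cmdline'] == 'worker')
print([n['stat']['pid'] for n in result])
```

Matching nodes from any depth end up as top-level elements of the result list, which mirrors the output shape described above.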
When recorded into a SQLite database, the schema is inferred from the Procfs files used. The node list is flattened and recorded into the record table, which has DDL like the following.
CREATE TABLE record (
    record_id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    ts        REAL NOT NULL,
    cmdline   TEXT,
    stat_pid  INTEGER,
    stat_comm TEXT,
    ...
)
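As a rough sketch of the flattening step, the following uses Python's built-in sqlite3 module with an in-memory database. It assumes only the columns spelled out in the DDL above (the elided ones are left out), and the `flatten` helper and the sample forest are hypothetical rather than taken from Procpath.

```python
import sqlite3
import time


def flatten(forest, ts):
    """Yield one flat row per process node, prefixing stat keys with stat_."""
    for node in forest:
        row = {'ts': ts, 'cmdline': node.get('cmdline')}
        for key, value in node.get('stat', {}).items():
            row[f'stat_{key}'] = value
        yield row
        yield from flatten(node.get('children', []), ts)


conn = sqlite3.connect(':memory:')
conn.execute('''
    CREATE TABLE record (
        record_id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
        ts        REAL NOT NULL,
        cmdline   TEXT,
        stat_pid  INTEGER,
        stat_comm TEXT
    )
''')

# Hypothetical query result
forest = [
    {
        'stat': {'pid': 1, 'comm': 'systemd'},
        'cmdline': '/sbin/init',
        'children': [
            {'stat': {'pid': 323, 'comm': 'worker'}, 'cmdline': 'worker --queue a'},
        ],
    },
]
rows = list(flatten(forest, time.time()))
conn.executemany(
    'INSERT INTO record (ts, cmdline, stat_pid, stat_comm) '
    'VALUES (:ts, :cmdline, :stat_pid, :stat_comm)',
    rows,
)
recorded = conn.execute(
    'SELECT stat_pid, stat_comm FROM record ORDER BY record_id'
).fetchall()
print(recorded)
```

Every node recorded in one pass shares the same ts value, so a later query can group rows by timestamp to rebuild a point-in-time snapshot.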
Procpath doesn’t pre-process Procfs data. For instance, rss is expressed in pages, utime in clock ticks and so on. To properly interpret the data in the record table, there’s also a meta table containing key-value records.
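Converting the raw values into familiar units is simple arithmetic once the page size and the clock tick rate are known. The sketch below assumes example values of 4096 bytes per page and 100 ticks per second; the helper names are hypothetical, and on a Linux machine the actual values can be read with os.sysconf as noted in the comments.

```python
def rss_bytes(rss_pages, page_size):
    """Convert a resident set size from pages to bytes."""
    return rss_pages * page_size


def cpu_seconds(ticks, clock_ticks):
    """Convert CPU time from clock ticks to seconds."""
    return ticks / clock_ticks


# Assumed example values. On the machine where the data was collected
# they can be obtained with os.sysconf('SC_PAGE_SIZE') and
# os.sysconf('SC_CLK_TCK') respectively.
page_size = 4096
clock_ticks = 100

print(rss_bytes(2048, page_size))     # rss of 2048 pages, in bytes
print(cpu_seconds(250, clock_ticks))  # utime of 250 ticks, in seconds
```

The conversion factors belong to the machine that produced the data, not the one analysing it, which is why they need to travel with the recording.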