For more on the background behind this project, see SOLID or this talk.
- Record runtime type information and call site file locations for every method call of your application (works for most Ruby programs)
- Ignores library code - i.e. only records method calls defined in your application
- [Not yet] Analysis of type information in Neo4j to:
  - show cyclic dependencies
  - highlight dependency issues across large file system distances
  - suggest potential domain concept file system organisations to simplify app structure
For the following code:
class A
def some_method
B.new.another_method(self) # cs1 e-step1
C.new.method_with_no_more_method_calls # cs3 e-step
D.some_class_method
end
def to_s
"a"
end
def self.boom!
raise "kaboom"
end
end
class D
def self.some_class_method
end
end
# frozen_string_literal: true
class B
def another_method(that)
that.to_s
C.new.method_with_no_more_method_calls # cs2 e-step
C.new.third(self)
end
def cyclic_dependency(s)
s.fourth
end
end
class C
def third(other)
other.cyclic_dependency(self)
end
def fourth
fifth
end
def fifth
sixth
end
def sixth
# execution chain ends here
end
def method_with_no_more_method_calls
end
end
(:CallStack{uuid: "111298a0-2d9c-468f-8e2c-1816a8c01899"})
- [:STEP{number: 1}] ->
(cs:CallSite{file: "file.rb", line_number: 1})
(cs) <- [:CONTAINS] - (m:Method{type:"InstanceMethod"})
(:Class{name: "A"}) - [:CONTAINS] -> (m)
(cs)-[:CALLS]
-> (m2:Method)
<-[:OWNS]-(:Class{name:"B"})
See https://github.com/ruby-analysis/delfos-custom-call-site-logger-example
Only one gem dependency on binding_of_caller
Delfos by default depends upon an optional connection to a Neo4j instance for recording data.
#Gemfile
# Until this gem is a little more stable, GitHub master is the best
# place to install from.
gem 'delfos', :git => 'https://github.com/ruby-analysis/delfos.git'
#e.g. in config/initializers/delfos.rb or equivalent
# Delfos affects performance, so we recommend only setting up when required
if defined?(Delfos) && ENV["DELFOS_ENABLED"]
Delfos.start!
end
# Any code defined in the app or lib directories executed after this point will
# automatically have call stacks with type information recorded.
# You could now click around the app or run integration tests to record type
# and callsite information
# When you are finished call from e.g. a console
# This flushes neo4j queries, updates the distance metrics
# and disables Delfos
Delfos.finish!
Recording test runs with RSpec
RSpec.configure do |c|
c.before(:suite) do
require "delfos"
your_library_path = File.expand_path("../../lib", __FILE__)
Delfos.configure { |config| config.include = your_library_path }
Delfos.start!
end
c.after(:suite) do
Delfos.finish!
end
end
# Example configuration
Delfos.configure do |config|
config.include = ["app"] # default value is ["app", "lib"]
config.include "another_directory"
config.include ["an", "array", "of", "directories", "or", "files.rb"]
config.include "a_file.rb"
config.exclude = ["app/app_config.rb", "app/another.rb"]
config.exclude "app/yet_another.rb"
config.exclude ["exclude", "works", "like", "include"]
config.call_site_logger = CustomCallSiteLogger.new
config.logger = Logger.new($stdout)
config.batch_size = 1_000
config.max_query_size = 10_000
config.offline_query_saving = false
config.offline_query_filename = nil
end
Delfos.start!
Possible config values:
Option | Type | Description
--- | --- | ---
include= | array or string | Files/directories to method trace. Defaults to app and lib.
include | array or string | Append files/directories to method tracing.
exclude= | array or string | Files/directories to exclude from method tracing. Defaults to empty.
exclude | array or string | Append files/directories to exclusion from method tracing.
call_site_logger | const | Defaults to recording to neo4j.
logger | const | For outputting debug information during method recording.
batch_size | int | Default batch size for neo4j query.
max_query_size | int | Max string length for neo4j queries before flushing.
offline_query_saving | boolean | Outputs call site queries to a local file. Defaults to delfos_cypher_output.cypher if true.
offline_query_filename | string | Define your own file to save offline queries to.
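To make the interplay of include and exclude more concrete, here is a minimal sketch of path filtering. It is not Delfos's implementation: the traced? helper and its root: parameter are hypothetical names used only for illustration, and the sketch assumes that exclusions take precedence over inclusions.

```ruby
# Hypothetical helper, not part of Delfos: decides whether a file falls under
# the configured include paths without being covered by an exclude path.
# Assumption: an excluded path always wins over an included one.
def traced?(path, include:, exclude: [], root: Dir.pwd)
  full = File.expand_path(path, root)

  # An entry covers the path if it is the file itself or a parent directory.
  covers = lambda do |entry|
    target = File.expand_path(entry, root)
    full == target || full.start_with?(target + File::SEPARATOR)
  end

  Array(include).any?(&covers) && Array(exclude).none?(&covers)
end

settings = { include: ["app", "lib"], exclude: ["app/app_config.rb"], root: "/project" }

in_app   = traced?("app/models/user.rb", **settings)
excluded = traced?("app/app_config.rb", **settings)
vendored = traced?("vendor/gem/thing.rb", **settings)
```

Under these assumptions only the first path would be traced; the second is excluded explicitly and the third lies outside the included directories.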
You can supply an object for the call_site_logger that responds to #log and #finish.
#log receives the following arguments: (call_site, stack_uuid, stack_step)
Where:
- call_site has the following methods: container_method, called_method, file, line_number
- stack_uuid is a unique identifier for the current call stack
- stack_step is the index of this call_site within the current call stack
- container_method & called_method have the following methods defined:
  - file
  - line_number
  - object - self at that point during runtime
  - class_method - boolean
#finish receives no arguments. Use this to tidy up,
e.g. flush queries or files, or perform any extra
calculations after all the data has been gathered.
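As a rough illustration of the interface described above, the following sketch collects one summary line per call site and writes them out in #finish. Only the #log(call_site, stack_uuid, stack_step) and #finish signatures come from this README; the class name InMemoryCallSiteLogger, its output argument and the FakeCallSite stand-in are hypothetical.

```ruby
require "stringio"

# Hypothetical logger: buffers a summary line per call site in memory.
class InMemoryCallSiteLogger
  attr_reader :lines

  def initialize(output = $stdout)
    @output = output
    @lines = []
  end

  # Called once per recorded call site.
  def log(call_site, stack_uuid, stack_step)
    @lines << format(
      "%s step %d: %s:%d",
      stack_uuid, stack_step, call_site.file, call_site.line_number
    )
  end

  # Called with no arguments once recording is over: write out the buffer.
  def finish
    @lines.each { |line| @output.puts(line) }
  end
end

# Stand-in for the call_site object Delfos would pass in (illustration only).
FakeCallSite = Struct.new(:file, :line_number, :container_method, :called_method)

output = StringIO.new
logger = InMemoryCallSiteLogger.new(output)
logger.log(FakeCallSite.new("app/a.rb", 3), "some-uuid", 1)
logger.finish
```

Such an object would then be assigned with config.call_site_logger = InMemoryCallSiteLogger.new inside the Delfos.configure block.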
By default Delfos tries to push queries in batches to neo4j during runtime.
This tends to work fine for small projects and for manually interacting with an application,
but starts to fall over when running against a whole test suite.
Whilst you can tweak the batch size and experiment with sleeping in between specs, it is
probably preferable to save the queries offline for later processing.
config.before(:suite) do
Delfos.configure { |c| c.offline_query_saving = true }
Delfos.start!
end
Then you can run the command line binary to import the data. E.g.
NEO4J_PORT=8001 NEO4J_HOST=localhost delfos_import
This splits the data into files of 10,000 queries each, in ./tmp/delfos
then executes them synchronously against neo4j.
Any queries that generate exceptions are logged in their corresponding error file.
E.g.
cat tmp/delfos/delfos_queries_ab.errors
{"step_number":1,"stack_uuid":"80dfd309-eb42-4ceb-aa85-8c3d6d137954","call_site_file":"spec/bundler/plugin/api/source_spec.rb","call_site_line_number":78,"container_method_klass_name":"RSpec::ExampleGroups::BundlerPluginAPISource::ToLock::WithAdditionalOptionsToLock","container_method_type":"InstanceMethod","container_method_name":"(main)","container_method_file":null,"container_method_line_number":-1,"called_method_klass_name":null,"called_method_type":"InstanceMethod","called_method_name":"to_lock","called_method_file":"lib/bundler/plugin/api/source.rb","called_method_line_number":254}
In general these error files probably represent bugs in Delfos so feel free to raise an issue with replication steps.
If the errors are obvious (e.g. an incorrectly determined method name), you could edit them and re-import with
cp ./tmp/delfos/delfos_queries_ab.errors fixed_query_parameters
NEO4J_PORT=8001 NEO4J_HOST=localhost delfos_import fixed_query_parameters
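If there are many error lines, editing them by hand gets tedious. The sketch below assumes, based on the sample above, that each line of an .errors file is a single JSON object of query parameters. The rewrite_error_params helper and the "ExampleClass" value are hypothetical and only show the shape of such a fix.

```ruby
require "json"

# Hypothetical helper: parses each JSON line, lets the caller patch the
# parameters in a block, and returns the re-serialised lines.
def rewrite_error_params(lines)
  lines.reject { |line| line.strip.empty? }.map do |line|
    params = JSON.parse(line)
    yield params
    JSON.generate(params)
  end
end

sample = [
  '{"step_number":1,"called_method_klass_name":null,"called_method_name":"to_lock"}'
]

fixed = rewrite_error_params(sample) do |params|
  # Placeholder value; substitute whatever the correct class name is.
  params["called_method_klass_name"] ||= "ExampleClass"
end

# To apply this to a real file, something along these lines should work:
#   lines = File.readlines("tmp/delfos/delfos_queries_ab.errors")
#   File.write("fixed_query_parameters", rewrite_error_params(lines) { |p| ... }.join("\n"))
```

The resulting file can then be passed to delfos_import as shown above.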
Delfos also records the distance across the file system between a call site and the method it calls. The distance is roughly the visual distance between the two files in an ordinary file system tree view, such as vim's NERDTree.
The file system distance is updated as a separate task after all (CallSite)
nodes have been added to the graph. This can be done by calling
Delfos::Neo4j.update_distance!.
Delfos.finish! also updates the distance metrics, so you may do this in an after suite hook like this:
config.after(:suite) do
Delfos.finish!
end
The scoring algorithm asserts that files that traverse a large number of directories to call other files end up with a 'worse' score than files which call files which are alphabetically next to each other in the same directory.
There is also a score recorded for number of possible files traversed. So projects which have large numbers of files per directory are also penalised.
This scoring system is quite likely to change as it is used against more systems to record sample data sets.
Snapshot of the delfos codebase:
▾ lib/
▾ delfos/
▸ distance/
▾ method_logging/
args.rb
args_spec.rb
code_location.rb
code_location_spec.rb
klass_determination.rb
▾ neo4j/
distance_update.rb
distance_update_spec.rb
execution_persistence.rb
informer.rb
informer_spec.rb
query_execution.rb
▸ patching/
common_path.rb
common_path_spec.rb
execution_chain.rb
execution_chain_spec.rb
integration_spec.rb
method_logging.rb
method_logging_spec.rb
patching.rb
patching_spec.rb
patching_unstubbing_spec_helper.rb
perform_patching.rb
perform_patching_spec.rb
remove_patching.rb
version.rb
delfos.rb
delfos_spec.rb
In the example above, if a call site in remove_patching.rb were to call a
site in common_path.rb in the same directory, it would receive 'penalty'
points for crossing the 11 files in between, but no penalty points for
traversing directories. It would also get penalty points for the 11 possible
file traversals in the directory.
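The same-directory case can be reproduced with a small sketch. This is not the Delfos scoring code: files_between is a hypothetical helper that only counts the files listed between two siblings in an alphabetically sorted listing, and it ignores directory traversals and the separate "possible traversals" score.

```ruby
# Hypothetical helper: number of files sitting between two siblings when the
# directory listing is sorted alphabetically, as in a tree view.
def files_between(directory_files, from, to)
  sorted = directory_files.sort
  from_index = sorted.index(from)
  to_index = sorted.index(to)
  raise ArgumentError, "both files must be in the listing" unless from_index && to_index

  [(from_index - to_index).abs - 1, 0].max
end

# Files directly inside lib/delfos/ in the snapshot above.
DELFOS_LIB_FILES = %w[
  common_path.rb common_path_spec.rb execution_chain.rb execution_chain_spec.rb
  integration_spec.rb method_logging.rb method_logging_spec.rb patching.rb
  patching_spec.rb patching_unstubbing_spec_helper.rb perform_patching.rb
  perform_patching_spec.rb remove_patching.rb version.rb
].freeze

crossed = files_between(DELFOS_LIB_FILES, "remove_patching.rb", "common_path.rb")
# => 11
```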
If code_location.rb were to call neo4j/query_execution.rb it would receive
- penalty points for traversing args.rb and args_spec.rb
- penalty points for the 4 possible file traversals in the directory
Then it would receive
- penalty points for moving up one directory,
- and more penalty points for traversing into the neo4j directory
- plus penalty points for possible traversals across the 4 directories
Finally it would receive
- penalty points for traversing across the 6 files
- penalty points for the 6 possible traversals
The ordinary points and points for possible traversals are recorded separately. The algorithm is likely to change once analysis is done on the effectiveness of the data.
Following are some ideas of where to take this project next:
I would like to create a UI for visualizing call stacks with their respective file system traversals.
I want to detect common software design mistakes in a way which is useful/actionable like rubocop.
Delfos specs are organized in a similar fashion to golang tests and follow
the principles outlined by this README. That is, code that changes
together lives together.
So there are no specs in the spec folder; unit specs live next to their implementation.
E.g.
lib/delfos/neo4j/
distance_update.rb
distance_update_spec.rb
The rake task is set up to handle this default and is equivalent to the following:
NEO4J_HOST=http://localhost NEO4J_PORT=8476 NEO4J_USERNAME=username NEO4J_PASSWORD=password bundle exec rspec lib