Ruby GSoC 2020: Final Report

Background

Type Checking

According to Aarne Ranta, a professor at Chalmers University of Technology, types are very important in most programming languages: they determine what the software does, guarantee that it is meaningful, optimize the use of hardware, and document the intentions of the programmer. Checking the types of objects can increase the correctness, reliability, and safety of software. Ruby is strongly typed but also dynamically typed: variables are not restricted to a specific type and little coercion is done, but the interpreter still catches type errors at runtime. As a result, duck typing is often used in Ruby because it allows flexibility and polymorphism. The official Python glossary defines duck typing as "a programming style which does not look at an object’s type to determine if it has the right interface; instead, the method or attribute is simply called or used". For example, the following method can accept any pair of arguments.

def return_sum_as_string(a, b)
  puts "The arguments are #{a} of class #{a.class} and #{b} of #{b.class}"
  (a + b).to_s
end

For example, if you pass the numbers 2 and 3 into the method, it returns "5". If you define a class whose instances can be added to integers and floats, the method also returns the right answer. However, if you pass arguments that do not support the :+ operation, either a TypeError or a NoMethodError is raised.

return_sum_as_string(1, 0.0) # => "1.0"
return_sum_as_string(1, SomeNumericClass('1')) # => "2"; If implemented, the latter can be coerced into an integer.
return_sum_as_string(1, false) # => TypeError; 1 supports :+, but false cannot be coerced into an integer
return_sum_as_string('foo', true) # => TypeError; 'foo' supports :+, but true cannot be converted into a string
return_sum_as_string(false, true) # => NoMethodError; false does not support :+
return_sum_as_string(false, 1) # => NoMethodError; false does not support :+

The method does not raise any error as long as the first argument's :+ operation accepts the second argument. This includes strings, which can lead to unintended behaviour that is difficult to debug.

return_sum_as_string('', '1') # => '1', works
return_sum_as_string('1', '1') # => '11'; unintended behaviour
return_sum_as_string(1, 1) # => "2"; the intended behaviour, for comparison

The best way to solve this particular problem is to convert the arguments to Integers or Floats, but there are other cases where you have to check the classes of the arguments passed in or of the value returned. The easiest way to do this is to raise an exception if the class is not the expected one.

def foo(a, b, c)
  # Check the arguments; parentheses are needed so that || combines the results of is_a?
  raise unless a.is_a?(Integer)
  raise unless b.is_a?(SomeClass)
  raise unless c.is_a?(SomeClass) || c.is_a?(OtherClass) || c.is_a?(Integer) || c.nil?

  # insert important code (e.g. communicating with a database or a client/server)

  result = ...

  # Check the return value, including every key and value of the Hash
  raise unless result.is_a?(Hash)
  result.each_pair do |k, v|
    raise unless k.is_a?(Symbol)
    raise unless v.is_a?(String)
  end

  result
end

This is similar to Design by Contract (DbC), where assertions are used to verify that the arguments and the return value are of the right type. As you can see, it clutters the code if you have to write this for many methods, and it gets quite complicated if you have to check the types inside a Hash or an Array. My project is to provide a way to do the same thing concisely.

There are many possible ways to do this. For example, you could use a preprocessor to turn statements (perhaps written in comments) into raise unless statements. Although I plan to do something similar later on, the approach taken here is different: a type checker that uses RBS and installs its hooks on top of some base mechanism through some good old Ruby metaprogramming. The prototype runtime type checker that RBS had used Module#prepend, but that could run into incompatibility issues, so I looked at the TracePoint API and alias_method_chain instead.

I originally used the TracePoint API for my prototype, but I faced a few problems with it. For example, I found that TracePoint would continue to process the :return event even if a problem had been found in the :call event. After my mentor discussed the API with the Ruby committers, it turned out not to be a good base, since the parameters passed to methods and blocks, and the method called within a block from the b_call event, cannot be identified. Because of this, we decided to use alias_method_chain (AMC). I changed the base of my prototype runtime type checker to AMC, while my mentor did the same for the existing runtime type checker in RBS.
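The alias-based hooking mentioned above can be sketched roughly as follows. This is only an illustration of the general alias_method_chain pattern and not the code used in RBS: TypeHook, install, and Calculator are hypothetical names, and the expected classes are passed in by hand instead of being read from an RBS signature.

```ruby
# Hypothetical sketch of the alias_method_chain pattern; not the RBS implementation.
module TypeHook
  # Wraps klass#name so that argument and return classes are checked on each call.
  def self.install(klass, name, arg_types:, return_type:)
    original = :"#{name}_without_type_check"

    klass.class_eval do
      # Keep the original implementation reachable under a new name.
      alias_method original, name

      # Redefine the method: check arguments, call the original, check the result.
      define_method(name) do |*args, &block|
        args.zip(arg_types).each do |arg, type|
          raise TypeError, "#{name}: expected #{type}, got #{arg.class}" unless arg.is_a?(type)
        end

        result = send(original, *args, &block)
        unless result.is_a?(return_type)
          raise TypeError, "#{name}: expected to return #{return_type}, got #{result.class}"
        end

        result
      end
    end
  end
end

# Illustrative class to hook into.
class Calculator
  def add(a, b)
    a + b
  end
end

TypeHook.install(Calculator, :add, arg_types: [Integer, Integer], return_type: Integer)
```

With the hook installed, Calculator.new.add(1, 2) behaves as before, while Calculator.new.add("1", "2") raises a TypeError before the original method runs. The method body itself stays free of raise unless lines, which is the point of installing the checks from the outside.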

My work

Optimizing the Current Type Checker

PRs Involved:

Profiling code

I worked on several PRs that optimized the AMC-based runtime type checker used in RBS. First, I profiled RBS against an old project of mine.

This was the report with the runtime type checker:
And this was the report without the runtime type checker:

Sampling option

I looked further into RBS::Test::TypeCheck#value since it took up a large chunk of the time, and I found that it is slow when a collection is passed, because every element has to be checked. My mentor recommended adding an option to sample an array, a hash, or an enumerator instead, so I did just that.

After I added the sampling option, less time was spent in the #value method. I then added an option to change the sample size to whatever the user wants.
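The idea behind sampling can be sketched as below. This is a minimal illustration under stated assumptions, not the code in RBS::Test::TypeCheck: sample_values, all_of_type?, and DEFAULT_SAMPLE_SIZE are hypothetical names, the default of 100 is chosen only for the example, and only arrays are handled.

```ruby
# Hypothetical sketch of sampling a collection before type-checking it;
# the names below are illustrative and not taken from RBS.
DEFAULT_SAMPLE_SIZE = 100 # assumed default, chosen only for this example

# Returns at most sample_size elements of the array; nil means "check everything".
def sample_values(array, sample_size = DEFAULT_SAMPLE_SIZE)
  return array if sample_size.nil? || array.size <= sample_size

  array.sample(sample_size)
end

# Checks only the sampled elements instead of walking the whole collection.
def all_of_type?(array, type, sample_size = DEFAULT_SAMPLE_SIZE)
  sample_values(array, sample_size).all? { |value| value.is_a?(type) }
end
```

The trade-off is that a single badly typed element in a large collection may not be part of the sample, so a check on a sample can pass where a full check would fail. Passing nil as the sample size in this sketch restores the full check.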

Push comment

After profiling the code again, I found that a long time was spent in #push_comment.

So I worked on optimizing the push_comment method by making it a destructive method instead of a non-destructive one.
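The difference between the two styles can be illustrated with plain strings. This is not the actual push_comment code; append_copy and append_in_place are hypothetical helpers that only show why mutating in place avoids repeated copying.

```ruby
# Illustration of non-destructive vs. destructive appends; not the actual push_comment code.

# Non-destructive: each call builds a brand-new String and leaves the input untouched.
def append_copy(buffer, line)
  buffer + line + "\n"
end

# Destructive: mutates the receiver in place, so no copy of the buffer is made.
def append_in_place(buffer, line)
  buffer << line << "\n"
end

lines = %w[first second third]

# The non-destructive version has to reassign the result on every step.
copied = ""
lines.each { |line| copied = append_copy(copied, line) }

# The destructive version keeps working on the same object.
mutated = String.new
lines.each { |line| append_in_place(mutated, line) }
```

Both variables end up with the same contents, but the non-destructive version allocates a new, longer string on every iteration, which adds up when many lines are appended to the same buffer.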

Improving the Usability of the Runtime Type Checker

While testing RBS on several projects, I ran into some problems using it. First, this is how you originally ran the runtime type checker:

RUBYOPT='-rbundler/setup -rrbs/test/setup' RBS_TEST_TARGET='WhatIsToBeTargeted,AnotherFile' RBS_TEST_OPT="-I sig" bundle exec ruby file.rb

When I first ran the runtime type checker, I didn't know why it wasn't working, since it didn't print anything. My mentor eventually told me that the string in the RBS_TEST_TARGET variable must not have any space after the commas, so running something like the following would not work.

RUBYOPT='-rbundler/setup -rrbs/test/setup' RBS_TEST_TARGET='WhatIsToBeTargeted, AnotherFile' RBS_TEST_OPT="-I sig" bundle exec ruby file.rb

I then opened PRs to allow whitespace in RBS_TEST_TARGET and to print a warning if no type checkers were installed, since at the time I had no way of knowing that none had been installed. After this, I added a command called test to make it easier to run the type checker.
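Tolerating whitespace in a comma-separated list comes down to stripping each entry after splitting. The following is a minimal sketch of that idea, not the code in RBS; parse_targets is a hypothetical helper and the warning text is made up for the example.

```ruby
# Hypothetical sketch of parsing a comma-separated target list while tolerating
# whitespace; parse_targets is an illustrative name, not an RBS method.
def parse_targets(raw)
  (raw || "").split(",").map(&:strip).reject(&:empty?)
end

targets = parse_targets("WhatIsToBeTargeted, AnotherFile")

# Illustrative warning so that an empty target list does not fail silently.
warn "No targets given, so no type checker was installed" if targets.empty?
```

Here both "WhatIsToBeTargeted,AnotherFile" and "WhatIsToBeTargeted, AnotherFile" produce the same two targets, and an unset or blank variable produces an empty list that triggers the warning.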

The API for the rbs test is as follows:

 rbs [RBS OPTIONS] test [TEST OPTIONS] COMMAND

RBS OPTIONS are the options that RBS currently uses, such as -I DIR and -r LIB.

TEST OPTIONS

| Command | Description |
| :-: | :-: |
| --target \<TARGET> | the module, class, or anything else that would be tested |
| --sample-size \<SIZE>\|ALL | the amount of values in a collection to be type-checked |

Improving the Compatibility with testing libraries

Minitest:

I added an API that allows users to ignore classes that would cause problems with RBS; these commonly come from mocking libraries.

Improving the type checking algorithms

Originally, the type checker didn't check the Record type, which is like a C struct in Hash form. I eventually added it with this PR.
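As a rough illustration of what checking a record involves, the sketch below validates a Hash against a description that maps each required key to a class. It is not the RBS implementation: record? and PERSON are hypothetical names, and a real checker would handle nested and optional types as well.

```ruby
# Hypothetical sketch of checking a Hash against a record-like description;
# not the RBS implementation. The record maps each required key to a class.
def record?(value, record)
  return false unless value.is_a?(Hash)

  # Every key named in the record must be present and hold a value of the right class.
  record.all? do |key, type|
    value.key?(key) && value[key].is_a?(type)
  end
end

# Illustrative record description: a person with a name and an age.
PERSON = { name: String, age: Integer }.freeze
```

For example, record?({ name: "Alice", age: 30 }, PERSON) is true, while a hash with a String age or a missing :name key is rejected. This sketch ignores extra keys, which is a simplification made for the example.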

Finding problems while testing RBS on other projects

I also found some issues that were then fixed by my mentor. For example, while testing RBS on projects that use mocks and stubs, I encountered the following error.

rbs/test/hook.rb:51:in `alias_method': undefined method `show_usage' for class `#<Class:RubyRush>' (NameError)

Future plans

Although all my PRs have been approved, I would like to continue working on the type checker in the future. I am interested in seeing if it's possible to add type coercion and more.

I am also working on a side project related to RBS and type checkers that converts documentation comments into RDoc and RBS signature files.

Acknowledgments

I would like to thank my mentor @soutaro, who has helped me throughout this past trimester, as well as the org admin. I would like to also thank the whole Ruby organization and Google for giving me this opportunity.