cEP-0018: Integration of ANTLR in coala
Propose integration of ANTLR in coala.

Closes #118
virresh committed May 31, 2018
1 parent 1ec5b92 commit fd2c467
Showing 2 changed files with 248 additions and 0 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -18,4 +18,5 @@ in the `cEP-0000.md` document.
| [cEP-0012](cEP-0012.md) | coala's Command Line Interface | This cEP describes the design of coala's Command Line Interface (the action selection screen). |
| [cEP-0013](cEP-0013.md) | Cobot Enhancement and porting | This cEP describes about the new features that are to be added to cobot as a part of the [GSoC project](https://summerofcode.withgoogle.com/projects/#4913450777051136). |
| [cEP-0014](cEP-0014.md) | Generate relevant coafile sections | This cEP proposes a framework for coala-quickstart to generate more relevant `.coafile` sections based on project files and user preferences. |
| [cEP-0018](cEP-0018.md) | Integration of ANTLR into coala | This cEP describes how an API based on ANTLR will be constructed and maintained |
| [cEP-0019](cEP-0019.md) | Meta-review System | This cEP describes the details of the process of implementing the meta-review system as a part of the [GSoC'18 project](https://summerofcode.withgoogle.com/projects/#5188493739819008). |
247 changes: 247 additions & 0 deletions cEP-0018.md
@@ -0,0 +1,247 @@
# Integration of ANTLR into coala core

| Metadata | |
| -------- | --------------------------------------------- |
| cEP | 18 |
| Version | 1.0 |
| Title | Integration of ANTLR into coala core |
| Authors | Viresh Gupta <mailto:[email protected]> |
| Status | Proposed |
| Type | Feature |

## Abstract

This document describes how an API based on ANTLR will be constructed and
maintained.

## Introduction

ANTLR is a parser generator: given the grammar of a language, it generates a
lexer and a parser for that language. coala can wrap these generated parsers
in an interface and make it available to bear writers so that they can write
advanced linting bears. The interface will aim to support the more flexible
visitor-based method of AST traversal (as opposed to the listener-based
mechanism).
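
To illustrate why the visitor model is the more flexible of the two, here is a minimal sketch on a toy tree. The `Node`, `Visitor` and `FunctionNameVisitor` classes are hypothetical stand-ins written for this example and are not part of ANTLR or coala. The point is that a visitor chooses whether to descend into a node's children, whereas a listener is notified of every node by a generic walker.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A toy parse-tree node (hypothetical stand-in for a generated context)."""
    rule: str
    text: str = ''
    children: List['Node'] = field(default_factory=list)


class Visitor:
    """Visitor-style traversal: each visit method controls the descent."""

    def visit(self, node):
        # Dispatch to visit_<rule> when it exists, else visit the children.
        method = getattr(self, 'visit_' + node.rule, self.visit_children)
        return method(node)

    def visit_children(self, node):
        results = []
        for child in node.children:
            results.extend(self.visit(child))
        return results


class FunctionNameVisitor(Visitor):
    def visit_funcdef(self, node):
        # Report this function and deliberately do not descend further,
        # so nested functions are skipped.
        return [node.text]


tree = Node('file_input', children=[
    Node('funcdef', 'outer', [Node('funcdef', 'inner')]),
    Node('funcdef', 'other'),
])
names = FunctionNameVisitor().visit(tree)
print(names)
```

Because `visit_funcdef` never visits its children, the nested function `inner` is not reported. With a listener, the walker would still enter it and the listener would have to track nesting depth itself.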

The proposal is to introduce a library parallel to coala-bears, referred to
as `coala-antlr` from here on.

## Proposed Change

The implementation will proceed in the following steps:

1. There will be a separate repository named `coala-antlr`, installable via
   `pip install coala-antlr`.
2. A new package, `coantlib`, will be introduced in the `coala-antlr`
   repository. It will maintain all the dependencies for `antlr-ast` based
   bears.
3. The `coantlib` package will provide endpoints relevant to the visitor
   model of traversing the AST generated via ANTLR.
4. Another package named `coantparsers` will be created inside the
   `coala-antlr` repository. It will hold the lexers and parsers generated
   beforehand (with the Python target) for the supported set of grammars.
5. `coantlib` will have an `ASTLoader` class responsible for loading the AST
   of a given file in the language specified by the user, using the file
   extension as a fallback. For example, a `.c` file will be loaded using
   the C parser from `coantparsers` if no language is specified.
6. An `ASTWalker` class will provide a single method that resolves the
   language supplied by the bear and creates a walker instance of the
   appropriate type.
7. `coantlib.walkers` will contain several language-specific and
   grammar-dependent walker classes derived from `ASTWalker`, for example a
   `Py3ASTWalker`.
8. The bears that depend on `coantlib` will reside in the same repository,
   in a separate package named `coantbears`. They will be installable
   together with `coantlib` via a single option in `setup.py`.
9. Bears will get access to a high-level API by defining the `LANGUAGES`
   attribute. If the language is supported, the bear's `walker` attribute
   will hold a walker exposing that API (for example, with
   `LANGUAGES = {'Python'}` the bear automatically gets a `BasePyASTWalker`
   instance).
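
The fallback rule in step 5 can be sketched as follows. This is only an illustration under stated assumptions: `EXTENSION_TO_LANGUAGE` and `resolve_language` are hypothetical names invented here, and the real table would be derived from the grammars shipped in `coantparsers`.

```python
import os

# Hypothetical extension table, for illustration only.
EXTENSION_TO_LANGUAGE = {
    '.c': 'C',
    '.py': 'Python',
    '.js': 'JavaScript',
}


def resolve_language(filename, lang='auto'):
    """Return the language to parse with, preferring an explicit choice."""
    if lang != 'auto':
        # The user (or the bear) named a language, so use it as given.
        return lang
    # Otherwise fall back to the file extension.
    ext = os.path.splitext(filename)[1].lower()
    if ext not in EXTENSION_TO_LANGUAGE:
        raise ValueError('Unknown file type: {!r}'.format(filename))
    return EXTENSION_TO_LANGUAGE[ext]


print(resolve_language('main.c'))
print(resolve_language('build.script', lang='Python'))
```

An explicit language always wins, so a file with an unusual extension can still be parsed as long as the bear or the user names its language.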

## Management of the new `coala-antlr` repository

Maintaining a new repository is a significant burden, so most of the work
will be automated as follows:

1. The parsers in `coala-antlr/coantparsers` would be generated and
pushed via CI builds whenever a new commit is pushed.
2. The cib tool can be enhanced to deal with the installation of bears that
require only some specified parsers (for e.g a `PyUnusedVarBear` would
only require parser for python).
3. The cib tool can also trigger specialised builds and download the newly
generated parser on the fly.
4. To add support for a new language, a new walker class deriving from
   `ASTWalker` will be added, and the rest will be taken care of
   automatically by the library.

## Code Samples/Prototypes

Here is a prototype for the implementations within `coantlib`:

```python
import inspect  # stdlib module; there is no module named `inspector`

import antlr4
import coantparsers

from antlr4 import CommonTokenStream, InputStream
from coalib.bearlib.languages.definitions import *
from coalib.bearlib.languages.Language import parse_lang_str, Languages


def resolve_with_package_info(lang):
    # Use the `inspect` helpers to search amongst all classes in this
    # module that inherit from ASTWalker, and return the class whose
    # supported language matches `lang`.
    raise NotImplementedError


class ASTLoader:
    # Maps each language to its generated [lexer, parser] pair.
    mapping = {
        C: [coantparsers.clexer, coantparsers.cparser],
        Python: [coantparsers.pylexer, coantparsers.pyparser],
        JavaScript: [coantparsers.jslexer, coantparsers.jsparser],
        # ...
    }

    @staticmethod
    def loadFile(file, filename, lang='auto'):
        """
        Loads file's AST into memory
        """
        if lang == 'auto':
            # get_extension is not defined in this prototype.
            ext = get_extension(filename)
        else:
            ext = Languages(parse_lang_str(lang)[0])[0]
        if ext is None:
            # FileExtensionNotSupported is not defined in this prototype.
            raise FileExtensionNotSupported('Unknown File Type')
        inputStream = InputStream(''.join(file))
        lexer = ASTLoader.mapping[ext][0](inputStream)
        # An ANTLR parser consumes a token stream, not the lexer itself.
        parser = ASTLoader.mapping[ext][1](CommonTokenStream(lexer))
        return parser


class ASTWalker:
    treeRoot = None
    tree = None

    @staticmethod
    def make_walker(lang):
        """
        Resolves walker object for appropriate type
        """
        if lang == 'auto':
            return ASTWalker
        else:
            return resolve_with_package_info(lang)

    @staticmethod
    def get_walker(file, filename, lang):
        """
        Returns a walker to the required language
        """
        ret_class = ASTWalker.make_walker(lang)
        if not ret_class:
            raise Exception('Unsupported language')
        else:
            return ret_class(file, filename)

    def __init__(self, file, filename):
        self.tree = ASTLoader.loadFile(file, filename)
        self.treeRoot = self.tree
    ...
```
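
The lookup that `resolve_with_package_info` is meant to perform could look roughly like the sketch below. It is an assumption-laden illustration, not the proposed implementation: it swaps the `inspect`-based module scan for a recursive walk over `__subclasses__()`, the helper names `all_subclasses` and `resolve_walker` are invented here, and the walker classes are empty placeholders so that the example runs on its own.

```python
class ASTWalker:
    # Placeholder base class; the real one also loads the tree.
    LANGUAGES = set()


class BasePyASTWalker(ASTWalker):
    LANGUAGES = {'Python'}


class Py3ASTWalker(BasePyASTWalker):
    LANGUAGES = {'Python 3'}


def all_subclasses(cls):
    """Yield every direct and indirect subclass of cls."""
    for sub in cls.__subclasses__():
        yield sub
        yield from all_subclasses(sub)


def resolve_walker(lang):
    """Return the walker class that declares support for lang, or None."""
    for walker in all_subclasses(ASTWalker):
        if lang in walker.LANGUAGES:
            return walker
    return None


print(resolve_walker('Python 3').__name__)
print(resolve_walker('COBOL'))
```

With this shape, adding a language only requires defining a new subclass with its own `LANGUAGES` set; no central registry has to be edited.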

### Prototype of `Py3ASTWalker`

Language-specific walkers such as these will supply a high-level walker API:

```python
class BasePyASTWalker(ASTWalker):
    LANGUAGES = {'Python'}

    def next_function(self):
        """
        Modify the tree node variable such that it only stops at functions
        which can be detected by the Python grammar rules
        """

    def get_imports(self):
        """
        Return list of strings containing import statements from the file
        """

    def get_methods(self):
        """
        Return list of strings containing all methods from the file
        """
    ...


class Py2ASTWalker(BasePyASTWalker):
    LANGUAGES = {'Python 2'}

    def get_print_statements(self):
        """
        Return list of print statements, since print is a statement in
        Python 2
        """
    ...


class Py3ASTWalker(BasePyASTWalker):
    LANGUAGES = {'Python 3'}

    def get_integer_div_stmts(self):
        """
        Returns a list of integer divide statements, since / and // behave
        differently in Python 3
        """
    ...
```
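
To give a feel for what `get_imports` and `get_methods` would return, the sketch below implements the same two queries on top of the standard library `ast` module. This is a deliberate substitution: it does not use ANTLR or `coantparsers`, and `StdlibPyWalker` is a hypothetical class written only for this example. It also returns imported module names rather than whole import statements.

```python
import ast

SOURCE = '''\
import os
from sys import argv

def main():
    def helper():
        pass
'''


class StdlibPyWalker:
    """Stand-in walker built on the stdlib ast module, not on ANTLR."""

    def __init__(self, source):
        self.tree = ast.parse(source)

    def get_imports(self):
        """Return imported module names, ordered by line number."""
        found = []
        for node in ast.walk(self.tree):
            if isinstance(node, ast.Import):
                found.extend((node.lineno, alias.name)
                             for alias in node.names)
            elif isinstance(node, ast.ImportFrom):
                # node.module is None for `from . import x`.
                found.append((node.lineno, node.module or ''))
        return [name for _, name in sorted(found)]

    def get_methods(self):
        """Return function names, ordered by line number."""
        found = [(node.lineno, node.name)
                 for node in ast.walk(self.tree)
                 if isinstance(node, (ast.FunctionDef,
                                      ast.AsyncFunctionDef))]
        return [name for _, name in sorted(found)]


walker = StdlibPyWalker(SOURCE)
print(walker.get_imports())
print(walker.get_methods())
```

An ANTLR-backed walker would answer the same questions by visiting the rule contexts of the Python grammar, which is what lets the approach extend to languages that have no built-in parser in Python.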

### Prototype of `ASTBear` class implementation

```python
from coantlib import ASTWalker
from coalib.bears.LocalBear import LocalBear


class ASTBear(LocalBear):
    walker = None

    def initialise(self, file, filename):
        if self.LANGUAGES:
            # LANGUAGES is a set, so take one member instead of indexing.
            self.walker = ASTWalker.get_walker(file,
                                               filename,
                                               next(iter(self.LANGUAGES)))
        else:
            self.walker = ASTWalker.get_walker(file, filename, 'auto')

    def run(self,
            filename,
            file,
            tree,
            *args,
            dependency_results=None,
            **kwargs
            ):
        raise NotImplementedError('Needs to be done by bear')
```

### A test bear

```python
from coantlib.ASTBear import ASTBear
from coalib.results.Result import Result


class TestBear(ASTBear):
    def run(self,
            filename,
            file,
            tree,
            *args,
            dependency_results=None,
            **kwargs
            ):
        self.initialise(file, filename)
        violations = {}
        method_list = self.walker.get_methods()
        for method in method_list:
            # logic for checking naming convention
            # and yielding an apt result
            # wherever required
            ...
```
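
The naming check left open in the loop above could be as simple as the following sketch. It assumes the convention being enforced is snake_case, which the proposal does not specify, and `SNAKE_CASE` and `find_naming_violations` are hypothetical names. Turning each offending name into a coala `Result` is omitted.

```python
import re

# Lowercase words joined by single underscores, optionally wrapped in up
# to two leading and trailing underscores (e.g. _private, __init__).
SNAKE_CASE = re.compile(r'^_{0,2}[a-z][a-z0-9]*(_[a-z0-9]+)*_{0,2}$')


def find_naming_violations(method_names):
    """Return the names that are not snake_case, preserving input order."""
    return [name for name in method_names if not SNAKE_CASE.match(name)]


violations = find_naming_violations(
    ['get_methods', 'loadFile', '__init__', 'MakeWalker'])
print(violations)
```

Keeping the check as a pure function over the list returned by `get_methods` makes it testable without constructing a bear or parsing a file.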
