Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cEP-0018: Integration of ANTLR in coala #124

Merged
merged 1 commit into from
Jun 6, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ in the `cEP-0000.md` document.
| [cEP-0012](cEP-0012.md) | coala's Command Line Interface | This cEP describes the design of coala's Command Line Interface (the action selection screen). |
| [cEP-0013](cEP-0013.md) | Cobot Enhancement and porting | This cEP describes about the new features that are to be added to cobot as a part of the [GSoC project](https://summerofcode.withgoogle.com/projects/#4913450777051136). |
| [cEP-0014](cEP-0014.md) | Generate relevant coafile sections | This cEP proposes a framework for coala-quickstart to generate more relevant `.coafile` sections based on project files and user preferences. |
| [cEP-0018](cEP-0018.md) | Integration of ANTLR into coala | This cEP describes how an API based on ANTLR will be constructed and maintained |
| [cEP-0019](cEP-0019.md) | Meta-review System | This cEP describes the details of the process of implementing the meta-review system as a part of the [GSoC'18 project](https://summerofcode.withgoogle.com/projects/#5188493739819008). |
| [cEP-0027](cEP-0027.md) | coala Bears Testing API | This cEP describes the implementation process of `BaseTestHelper` class and `GlobalBearTestHelper` class to improve testing API of coala as a part of the [GSoC'18 project](https://summerofcode.withgoogle.com/projects/#6625036551585792). |
| [cEP-0021](cEP-0021.md) | Profile Bears | This cEP describes the detailed implementation of a profiling interface as a part of the [GSoC'18 project](https://summerofcode.withgoogle.com/projects/#6109762077327360). |
274 changes: 274 additions & 0 deletions cEP-0018.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,274 @@
# Integration of ANTLR into coala core

| Metadata | |
| -------- | --------------------------------------------- |
| cEP | 18 |
| Version | 1.0 |
| Title | Integration of ANTLR into coala core |
| Authors | Viresh Gupta <mailto:[email protected]> |
| Status | Proposed |
| Type | Feature |

## Abstract

This document describes how an API based on ANTLR will be constructed and
maintained.

## Introduction

ANTLR provides parsers for various language grammars and thus we can provide
an interface from within coala and make it available to bear writers so that
they can write advanced linting bears. This will aim at supporting the more
flexible visitor based method of AST traversal (as opposed to listener based
mechanisms).

The proposal is to introduce a parallel concept to the coala-bears library,
which will be called `coala-antlr` hereon.

## Proposed Change

Here is the detailed implementation stepwise:

1. There will be a separate repository named as `coala-antlr` which will
be installable via `pip install coala-antlr`.
2. A new package would be introduced in the coala-antlr repository,
which would maintain all the dependencies for `antlr-ast` based
bears. This would be the `coantlib` package
3. The `coantlib` package will provide endpoints relevant to the
visitor model of traversing the ast generated via ANTLR.
4. Another package named `coantparsers` will be created inside
`coala-antlr` repository. This package will hold parsers and lexers
generated for various languages generated (with python target) beforehand
for the supported set of grammars.
5. `coantlib` will have an `ASTLoader` class that will be responsible
for loading the AST of a given file depending on the user input, using file
extension as a fallback.
For e.g, a .c file will be loaded using the parser from
`coantlib` for c language if no language is specified.
6. Another `ASTWalker` class will be responsible for providing a single method
to resolve the language supplied by the bear and create a walker instance
of the appropriate type.
7. Also `coantlib.walkers` will have several walker classes derived from
`ASTWalker` class that will be language specific and grammar dependant.
For e.g a `Py3ASTWalker`.
8. The bears dependant on the `coantlib` will reside in the same repository as
the `coantlib`. All of them will be in a different package, the `coantbears`
package, all of them installable together via a single option in the
setup.py along with `coantlib`.
9. The bears will be able to use a high level API by defining the `LANGUAGE`
instance variable, which will return to them a walker with high level API's
in case the language is supported. (for e.g if `LANGUAGE='Python'`, they
will automatically get a `BasePyASTWalker` instance in the walker class
variable)

## Management of the new `coala-antlr` repository

Managing a new repository is a heavy task, and this will be highly automated as
follows:

1. The parsers in `coala-antlr/coantparsers` would be generated and
pushed via CI builds whenever a new commit is pushed.
2. The cib tool can be enhanced to deal with the installation of bears that
require only some specified parsers (for e.g a `PyUnusedVarBear` would
only require parser for python).
3. There will be nightly cron jobs for keeping parsers up-to-date.
4. For adding support for a new language, a new Walker class deriving
`ASTWalker` would be added and the rest will be automatically taken care by
the library.

## Code Samples/Prototypes

Here is a prototype for the implementations within `coantlib`:

```python
import antlr4
import coantparsers
import inspector

from antlr4 import InputStream
from coalib.bearlib.languages.definitions import *
from coalib.bearlib.languages.Language import parse_lang_str, Languages


def resolve_with_package_info(lang):
"""
Use the inspector methods to search amongst all
classes that inherit ASTWalker within this module
and return that class whose supported language matches
"""


class ASTLoader(object):

mapping = {
C: [coantparsers.clexer, coantparsers.cparser],
Python: [coantparsers.pylexer, coantparsers.pyparser],
JavaScript: [coantparsers.jslexer, coantparsers.jsparser],
...
}

@staticmethod
def load_file(file, filename, lang='auto'):
"""
Loads file's AST into memory
"""
if(lang == 'auto'):
ext = get_extension(filename)
else:
ext = Languages(parse_lang_str(lang)[0])[0]
if ext is None:
raise FileExtensionNotSupported('Unknown File Type')
inputStream = InputStream(''.join(file))
lexer = mapping[ext][0](inputStream)
parser = mapping[ext][1](lexer)
return parser


class ASTWalker(object):
treeRoot = None
tree = None

@staticmethod
def make_walker():
"""
Resolves walker object for appropriate type
"""
if lang == 'auto':
return ASTWalker
else:
return resolve_with_package_info(lang)

@staticmethod
def get_walker(file, filename, lang):
"""
Returns a walker to the required language
"""
ret_class = make_walker(lang)
if not ret_class:
raise Exception('Un-supported Language')
else:
return ret_class(file, filename)

def __init__(file, filename):
self.tree = ASTLoader.load_file(file, filename)
self.treeRoot = self.tree
...
```

### Prototype of `Py3ASTWalker`

These kinds of Walkers will supply a high level walker API

```python
"""Prototype of Python ASTWalkers."""


class BasePyASTWalker(ASTWalker):
"""Base class for Python Walkers. This will be an abstract class."""

LANGUAGES = {'Python'}

def get_imports():
"""Return list of strings containing import statements."""

def get_methods():
"""Return list of strings containing all methods from the file."""
...


class Py2ASTWalker(BasePyASTWalker):
"""Python 2 specific walker."""

LANGUAGES = {'Python 2'}

def get_print_statements():
"""Return list of print statements, since print is a keyword in py2."""
...


class Py3ASTWalker(BasePyASTWalker):
"""Python 3 specific walker."""

LANGUAGES = {'Python3'}

def get_integer_div_stmts():
"""
Return a list of integer divide statements.

since / and // are different in py3, this is py3 specific
"""
...
```

### Prototype of `ASTBear` class implementation

```python
"""Prototype of ASTBear class."""

from coantlib import ASTWalker
from coalib.bears.LocalBear import Bear


class ASTBear(LocalBear):
"""
ASTBear class.

This class is similar to Local Bear except for the added capability of
initialising walker for use by the instance of this bear.
"""

walker = None

def initialise(file, filename):
"""Initialise self.walker according to the file and language."""
if LANGUAGES:
self.walker = ASTWalker.get_walker(file,
filename,
self.LANGUAGES[0])
else:
self.walker = ASTWalker.get_walker(file, filename, 'auto')

def run(self,
filename,
file,
tree,
*args,
dependency_results=None,
**kwargs
):
"""Actual run method - to be implemented by bear."""
raise NotImplementedError('Needs to be done by bear')
```

### A test bear

```python
"""A sample Bear."""

from coantlib.ASTBear import ASTBear
from coalib.results.Result import Result


class TestBear(ASTBear):
"""A test bear that uses walker from coantlib for linting."""

def run(self,
filename,
file,
tree,
*args,
dependency_results=None,
**kwargs
):
"""Essence of logic for bear."""
self.initialise(file, filename)
violations = {}
method_list = self.walker.get_methods()
for method in method_list:
"""
logic for checking naming convention
and yielding an apt result
wherever required
"""
...
```