Juancarlo Añez, a Portfolio

The usual curriculum vitae describes the roles and responsibilities of each job well, but says little about the work that was actually done. That’s why I thought it would be good to display some of the work I’ve done over the years in an easy-to-grok way: a portfolio.

Please send any queries or comments by email, by voice to my mobile or office numbers, or through Telegram or Skype.

The Work

A portfolio should start with a focus on the most recent work.

The Python PEG Parser

In 2019 I collaborated with Guido van Rossum on the project design and architecture of a new parser for Python using PEG. The initial prototyping was done with TatSu. Later Guido assembled a team that created a solid implementation so quickly that the new parser was included in Python 3.9 by its release date in October 2020.

Web Crawling

I’ve been crawling the Web since 2017, and I’ve learned some fundamentals:

  • Information has value, so everyone wants to crawl, and everyone else wants to avoid being crawled.

  • Math (statistics, data science, machine learning) is the key in this niche. Crawlers use it to be human-like, and sites use it to spot the lie.

  • As long as human users are able to browse a site, bots will be able to crawl it, provided they behave like humans.

This is code that applies the multi-armed bandit strategy to header-profile selection in a crawling bot, so the choice converges to what the crawled site likes:

def choose(self):
    g = self.profile_randomization
    k = len(self.weight)
    s = sum(self.weight)

    # probabilities
    p = [((1 - g) * w / s) + (g / k) for w in self.weight]

    # re-calculate weights
    for j, _ in enumerate(self.weight):
        m = self.success[j]
        n = self.failure[j]
        x = m / (m + n) if n + m else 0
        expected_reward = (1 - g) * p[j] * x / k

    self.weight = normalized_weights(self.weight, lower=1.0, upper=100.0)
    return weighted_choice(p)
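For readers unfamiliar with the technique, here is a minimal, self-contained sketch of an EXP3-style bandit, which mixes weight-proportional probabilities with a uniform exploration term in the same way as the excerpt above. It is an illustration under stated assumptions, not the crawler's actual code: the class `Exp3`, the function `simulate`, and the success rates are all hypothetical, and the reward is assumed to be 1.0 when the site accepts a request and 0.0 when it rejects it.

```python
import math
import random


class Exp3:
    """EXP3-style bandit over k arms (for example, k header profiles)."""

    def __init__(self, k, gamma=0.1, seed=None):
        self.k = k
        self.gamma = gamma          # exploration rate, like `g` above
        self.weights = [1.0] * k
        self.rng = random.Random(seed)

    def probabilities(self):
        # weight-proportional choice mixed with uniform exploration
        total = sum(self.weights)
        g, k = self.gamma, self.k
        return [(1 - g) * w / total + g / k for w in self.weights]

    def choose(self):
        p = self.probabilities()
        return self.rng.choices(range(self.k), weights=p)[0]

    def update(self, arm, reward):
        # importance-weighted reward estimate for the arm that was played
        p = self.probabilities()
        estimate = reward / p[arm]
        self.weights[arm] *= math.exp(self.gamma * estimate / self.k)
        # rescale to keep weights bounded; the probabilities do not change
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]


def simulate(success_rates, rounds=5000, seed=0):
    """Play against arms that succeed with the given (made-up) rates."""
    bandit = Exp3(len(success_rates), seed=seed)
    env = random.Random(seed + 1)
    for _ in range(rounds):
        arm = bandit.choose()
        reward = 1.0 if env.random() < success_rates[arm] else 0.0
        bandit.update(arm, reward)
    return bandit.probabilities()
```

Calling `simulate([0.1, 0.9, 0.3])` returns probabilities that concentrate on the second arm, while the `gamma / k` term keeps every profile in occasional use so a change in the site's behavior can still be noticed.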

With the library I wrote for session management, ban detection, and challenge resolution, a crawler can solve the challenge posed by a site with code like this:

    def is_challenge_response(self, response):
        return response.css('head script:contains(site-id)::text')

    def make_challenge_request(self, response, request):
        value = self.solve_challenge(response)
        if not value:
            return

        return FormRequest.from_response(
            response,
            formdata={
                'secret': str(value)
            },
        )

The following animation shows the directory structure and my commits for the project I worked on during 2017 and 2018. The large cluster is the set of spiders/crawlers. The “satellites” are refactorings of complex crawlers. The long branch consists of contributions to the common libraries.

The Parsers

From 2012 to 2016 I built parsers, analyzers, and translators for the COBOL, Java, M2, Natural, PowerScript, SQL, and Visual Basic programming languages. As part of that effort I also wrote the TatSu parser generator (see below).

Those parsers were sold under terms that don’t allow me to publish anything about them. In 2016, however, I wrote parsers for COBOL, M204, IBM MFS, Natural, PowerScript, SQL, and Visual Basic again, clean-room (from scratch).

The following Gource animation shows the commits by the team to the parsers and the front-end webapps during 2016. The parsers are the projects in the center, and the webapps the flower-ended structures on the border. I am the dancing white box with the dragon logo.

The following charts show my commits relative to other members of the team.

  • TatSu

  • COBOL

  • Code Generation Webapp

  • Deployment Automation

  • Java

  • Legacy Analysis Webapp

  • MFS

  • Natural

  • PowerScript

  • Commons

  • SQL

  • Visual Basic

TatSu

During 2012 ANTLR v4 was still a work in progress, and I had great difficulty making ANTLR v3 parse through the ambiguous constructs of programming languages such as COBOL and Natural. I convinced my boss to let me develop our own parser generator which, though simple, was able to handle the ambiguities without all the work ANTLR was requiring.

After forty continuous days of work beginning in mid-December 2012, Grako was published as open source in February 2013. Grako later became TatSu. This is the Google Analytics map of visits to TatSu’s pages.

TatSu Sessions Worldwide

This is a fragment of the TatSu grammar for Java:

method_declaration_mixin
    =

    type_parameters:type_parameters
    type:type name:identifier
    parameters:formal_parameters
    dimensions:dimensions
    ['throws' throws:qualified_name_list]
    [
        | ';'  # for abstract and interface methods
        | block:block [';' ~ ]  # a trailing ;
    ]
    ;

method_declaration::MethodDeclaration
    =
    modifiers:{modifier}
    >method_declaration_mixin
    ;


interface_method_declaration::MethodDeclaration
    =
    modifiers:{interface_method_modifier}
    >method_declaration_mixin
    ;

And this is part of the corresponding Python code generated for the parser:

    @graken()
    def _method_declaration_mixin_(self):
        self._type_parameters_()
        self.name_last_node('type_parameters')
        self._type_()
        self.name_last_node('type')
        self._identifier_()
        self.name_last_node('name')
        self._formal_parameters_()
        self.name_last_node('parameters')
        self._dimensions_()
        self.name_last_node('dimensions')
        with self._optional():
            self._token('throws')
            self._qualified_name_list_()
            self.name_last_node('throws')
        with self._optional():
            with self._choice():
                with self._option():
                    self._token(';')
                with self._option():
                    self._block_()
                    self.name_last_node('block')
                    with self._optional():
                        self._token(';')
                        self._cut()
                self._error('expecting one of: ;')
        self.ast._define(
            ['block', 'dimensions', 'name', 'parameters', 'throws', 'type', 'type_parameters'],
            []
        )
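The generated code relies on two PEG behaviors: the options of a choice are tried in order, and an optional group that fails leaves the input position untouched. The following is a minimal sketch of those two behaviors over a list of tokens. The names `MiniPEG`, `ParseError`, and `method_tail` are hypothetical and are not TatSu's API, a block is reduced to the single token `'{}'`, and the cut (`~`) is left out.

```python
class ParseError(Exception):
    pass


class MiniPEG:
    """Tiny backtracking parser over a list of tokens (illustration only)."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def token(self, expected):
        if self.pos < len(self.tokens) and self.tokens[self.pos] == expected:
            self.pos += 1
            return expected
        raise ParseError(f'expecting {expected!r} at {self.pos}')

    def optional(self, rule):
        # try the rule; on failure restore the position and return None
        start = self.pos
        try:
            return rule()
        except ParseError:
            self.pos = start
            return None

    def choice(self, *rules):
        # ordered choice: the first rule that succeeds wins
        start = self.pos
        for rule in rules:
            try:
                return rule()
            except ParseError:
                self.pos = start
        raise ParseError(f'no option matched at {self.pos}')


def method_tail(tokens):
    """Parse [ ';' | block [';'] ] and return (what matched, tokens consumed)."""
    p = MiniPEG(tokens)

    def block():
        p.token('{}')
        p.optional(lambda: p.token(';'))  # a trailing ;
        return 'block'

    kind = p.optional(lambda: p.choice(lambda: p.token(';'), block))
    return kind, p.pos
```

With this sketch, `method_tail([';'])` matches the first option, `method_tail(['{}', ';'])` matches the block together with its trailing semicolon, and `method_tail(['x'])` matches nothing and consumes nothing.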

The Practice

Continuous study and practice is the big secret to all important achievements.

Exercism

I have been completing the Python, Swift, and Go tracks at the Exercism web site as a way to practice and obtain reviews from some of my peers.

My solutions to the problems can be found here. This is a sample of the kind of solutions I wrote:

Solution to Rail Fence Exercism

Project Euler

I wrote about Python in 2000, and started using it for projects in 2011, when it became obvious that the programming language would become mainstream and important. To gain experience, among other efforts, I solved the problems at the Project Euler web site. My solutions to more than 50 of the problems are in this repository.

This is a sample:

#!/usr/bin/env python
"""
Solution to Project Euler Problem 24
http://projecteuler.net/

by Apalala <apalala@gmail.com>
(cc) Attribution-ShareAlike
http://creativecommons.org/licenses/by-sa/3.0/

A permutation is an ordered arrangement of objects. For example, 3124 is one
possible permutation of the digits 1, 2, 3 and 4. If all of the permutations
are listed numerically or alphabetically, we call it lexicographic order. The
lexicographic permutations of 0, 1 and 2 are:

012   021   102   120   201   210

What is the millionth lexicographic permutation of the digits 0, 1, 2, 3, 4,
5, 6, 7, 8 and 9?
"""
from itertools import permutations
from itertools import islice


def nth_element(iterable, n):
    return next(islice(iterable, n - 1, n))


def nth_permutation(digits, n):
    return ''.join(nth_element(permutations(digits), n))


def test():
    assert '210' == nth_permutation('012', 6)


if __name__ == '__main__':
    test()
    print(nth_permutation('0123456789', 10 ** 6))
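The sample above walks through the permutations one by one until it reaches the requested position. As a point of comparison, the same result can be computed directly with the factorial number system, which picks each digit by dividing the remaining rank by the number of permutations of the digits that are left. This is an alternative sketch, not part of the original solution, and it assumes the digits are all distinct.

```python
from math import factorial


def nth_permutation(digits, n):
    """Return the n-th (1-based) lexicographic permutation of distinct digits."""
    pool = sorted(digits)
    index = n - 1  # zero-based rank
    if not 0 <= index < factorial(len(pool)):
        raise ValueError('n out of range')
    result = []
    while pool:
        # each choice of leading digit covers (len(pool) - 1)! permutations
        block = factorial(len(pool) - 1)
        position, index = divmod(index, block)
        result.append(pool.pop(position))
    return ''.join(result)
```

The call `nth_permutation('012', 6)` returns `'210'`, matching the test in the sample above, and the work grows with the number of digits rather than with `n`.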

The Story

Experience takes work… and time. This section is about some of the things I’ve been involved in since 1985.

I’ve always been good at designing user interfaces, but I’ve kept away from it for some years in the hope that the technologies behind Web user interfaces would stabilize a bit (and they haven’t). The main goal of a UI should be to present information in a way that allows users to obtain meaning from it.

DUnit

I was the original developer of DUnit, the library that became the official testing framework for Embarcadero Delphi (previously Borland Delphi) until it was replaced by DUnitX.

DUnit User Interface

TRANUS

I was co-owner, CTO, and main programmer at Modelistica from 1985 to 2005. During that period I made important contributions to the very successful TRANUS urban and regional simulation system, and I developed its user interface to make the lives of the modelers more enjoyable.

The original model was developed using Fortran, and the UI was first in Visual Basic and then in Borland Delphi.

The image below is of TRANUS displaying the road types for the city of Recife in Brazil.

City of Recife, Brazil, in TRANUS

To be able to account for forbidden turns and turn delays at intersections, the internals of the model worked with the dual of the network graph. This is the FORTRAN code for finding a minimum path in the dual graph:

subroutine FindPath(i, j, astar, path, pathSize, cost, status)
    real      astar(*)
    integer   path(*)
    integer   pathSize
    integer   status

    integer(4) :: IHMAX = 0

    real    :: xcmin(NLINKRUT)
    integer :: xpmin(NLINKRUT)
    real    :: cosReal(NLINKRUT)

    integer   l, lnrCan, irfte, iofte,lrc
    double precision ::    ctransf,costocan, MaxCostSeen, pcost
    integer   nh, lnr, lnrAbs, ircan
    logical(1) :: mark(NLINKRUT)
    integer(4) :: inode, cnode

177 format(/,' N', I6,' Or',I6,'  Des',I6,'  T',I6,'  R',I6,'  C',2F12.5)

    xcmin = RINF
    xpmin = 0
    cosReal = RINF

    call heapalloc(MXCON*NLINKRUT)
    inode = matz(i)
    do while(graph(inode)%linkrut /= 0)
        pcost = node_ovr_cost(inode)
        lnr = abs(graph(inode)%linkrut)
        xcmin(lnr) = pcost
        cosReal(lnr) = node_cost(inode)
        call heappush(pcost + astar(lnr), inode)
        inode = inode+1
    end do

    status = msg_NoPath
    mark =.FALSE.
    MaxCostSeen=0
    nhmax   = 0
    do while(.not. heapempty())
        call heappop(pcost, inode)
        lnr = graph(inode)%linkrut
        lnrAbs=abs(lnr)
        l   = ilink(lnrAbs)
        if (mark(lnrAbs)) CYCLE
        mark(lnrAbs)=.TRUE.
        if(.false..and.MaxCostSeen /= 0 .and. (pcost/MaxCostSeen) < 0.999991) then
            print *,  'PASMIN ',pcost,MaxCostSeen, pcost/MaxCostSeen, heaplen
            print *,  '  from ',numzon(i), ' to', numzon(j)
            status = msg_PasosInconsistente
            call heapcheck
            return
        endif
        MaxCostSeen=pcost
        nf=idzon(l)
        if(nf /= 0) then
            if(nf == j)  then
                status=msg_OK
                exit ! end of search
            endif
            cycle  ! paths cannot go through zones
        endif
        ! search for candidates
        cnode  = lnrData(lnrAbs)%connected
        do while(graph(cnode)%linkrut /= 0)
            lnrcan = abs(graph(cnode)%linkrut)
            if (.not. mark(lnrcan)) then
                costocan = xcmin(lnrAbs) + node_ovr_cost(cnode)
                if (costocan < xcmin(lnrcan)) then
                    xcmin(lnrcan) = costocan
                    xpmin(lnrcan) = lnr
                    cosReal(lnrcan) = cosReal(lnrabs) + node_cost(cnode)
                    costocan = costocan + astar(lnrcan)
                    call heappush(costocan, cnode)
                endif
            endif
            cnode = cnode+1
        enddo
        ihmax=heaplen
    enddo
    if(status.eq.msg_OK) then
        cost = cosReal(lnrabs)
        call BuildPath(xpmin,lnr, path, pathSize, status)
    endif
end subroutine
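The idea of searching the dual graph can be sketched in a few lines of Python: the search states are links instead of nodes, and moving from one link to the next is a turn that may carry a delay or be forbidden. This is a simplified illustration and not a translation of the TRANUS routine. The function `find_path`, the `links` and `turns` dictionaries, and the optional `heuristic` callable are all hypothetical, and with the default zero heuristic the search reduces to Dijkstra's algorithm.

```python
import heapq


def find_path(links, turns, origin, destination, heuristic=None):
    """Shortest path over the dual graph: states are links, moves are turns.

    links: {link_id: (from_node, to_node, cost)}
    turns: {(link_in, link_out): delay}; a delay of None forbids the turn.
    Returns (total_cost, [link_id, ...]) or (inf, []) when no path exists.
    """
    heuristic = heuristic or (lambda link_id: 0.0)
    outgoing = {}
    for link_id, (from_node, _, _) in links.items():
        outgoing.setdefault(from_node, []).append(link_id)

    best, previous, heap = {}, {}, []
    for link_id in outgoing.get(origin, []):
        best[link_id] = links[link_id][2]
        previous[link_id] = None
        heapq.heappush(heap, (best[link_id] + heuristic(link_id), link_id))

    done = set()
    while heap:
        _, link_id = heapq.heappop(heap)
        if link_id in done:
            continue
        done.add(link_id)
        to_node = links[link_id][1]
        if to_node == destination:
            cost, path = best[link_id], []
            while link_id is not None:  # walk back through predecessors
                path.append(link_id)
                link_id = previous[link_id]
            return cost, path[::-1]
        for next_id in outgoing.get(to_node, []):
            if next_id in done:
                continue
            delay = turns.get((link_id, next_id), 0.0)
            if delay is None:  # forbidden turn
                continue
            candidate = best[link_id] + delay + links[next_id][2]
            if candidate < best.get(next_id, float('inf')):
                best[next_id] = candidate
                previous[next_id] = link_id
                heapq.heappush(heap, (candidate + heuristic(next_id), next_id))
    return float('inf'), []


# A made-up network in which the direct turn from 'ab' to 'bd' can be forbidden.
links = {
    'ab': ('A', 'B', 1.0),
    'bd': ('B', 'D', 1.0),
    'bc': ('B', 'C', 1.0),
    'cb': ('C', 'B', 1.0),
}
```

With no turn restrictions the path from A to D is `['ab', 'bd']`. When `turns` is `{('ab', 'bd'): None}` the path becomes `['ab', 'bc', 'cb', 'bd']`, which passes through node B twice. A search over nodes cannot represent that path, and that is the reason for working on the dual graph.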

MALLA

One of my first projects consisted of modeling space frame structures, calculating their per-beam stress, and optimizing the amount of material used in the struts that would bear the loads.

The system was developed using C, C++, and Borland Delphi.

Different views in MALLA

A nice perspective from MALLA

SEPA

For many years, starting in 1986, I helped the School of Architecture and the School of Law of the Universidad Central de Venezuela (Central University of Venezuela) process their admission exams.

Results and answers in SEPA

EVA

As an entrepreneur, I used to take consulting and programming gigs with companies, institutions, and people who knew my work.

EVA was a small system that would take the answers given by students about their teachers’ performance, and present the results in a unified UI, by grade, section, teacher, subject, or school. The grades were graphically represented by martial art belt colors.

Evaluation of teachers of Colegio La Salle by EVA

Gol 2014

Well, I haven’t kept completely away from Web user interfaces. I built responsive UIs for several projects using SQLAlchemy, Flask, WTForms, and Bootstrap. Because play is important, in my spare time I built a static site generator for tracking friendly bets over the 2014 FIFA World Cup:

Bet results for a player in Gol 2014

Consulting

My consulting work consisted mostly of creating analysis, design, testing, and planning documents and presentations, and sometimes leading a team to deliver the software. It included work for some very large companies in Venezuela and the US.

The following process diagram describes the project plan for the time-limited migration of a software system in Java to Visual Basic as part of the sale of a corporate business unit. The parallelization allowed delivery of the product ahead of schedule, with zero bugs.

GST

The Words

I was a writer for Windows Tech Journal, VB Tech Journal, and Java Report. I was also an associate editor for Windows Tech Journal, and a correspondent for In Publishing while writing for the Borland Community Site about Open-Source Software.

VB Tech Journal Editorial