Juancarlo Añez, a Portfolio¶
The usual curriculum vitae describes the roles and responsibilities of each job well, but says little about the work that was actually done. That’s why I thought it would be good to display some of the work I’ve done over the years in an easy-to-grok way: a portfolio.
Please send any queries or comments by email, by voice to my mobile or office numbers, or through Telegram or Skype.
The Work¶
A portfolio should start with a focus on the most recent work.
The Python PEG Parser¶
In 2019 I collaborated with Guido van Rossum on the project design and architecture of a new parser for Python using PEG. The initial prototyping was done with TatSu. Later Guido assembled a team that created a solid implementation so quickly that the new parser was included in Python 3.9 by its release date in October 2020.
Web Crawling¶
I’ve been crawling the Web since 2017, and I’ve learned some fundamentals:
-
Information has value, so everyone wants to crawl, and everyone else wants to avoid being crawled.
-
Math (statistics, data science, machine learning) is the key in this niche. Crawlers use it to be human-like; sites use it to spot the lie.
-
As long as human users are able to crawl, bots will too, as long as they behave humanely.
This code applies the multi-armed bandit strategy to header-profile selection in a crawling bot, so that the choice converges on what the crawled site likes:
def choose(self):
    g = self.profile_randomization
    k = len(self.weight)
    s = sum(self.weight)

    # probabilities
    p = [((1 - g) * w / s) + (g / k) for w in self.weight]

    # re-calculate weights
    for j, _ in enumerate(self.weight):
        m = self.success[j]
        n = self.failure[j]
        x = m / (m + n) if n + m else 0
        expected_reward = (1 - g) * p[j] * x / k

    self.weight = normalized_weights(self.weight, lower=1.0, upper=100.0)
    return weighted_choice(p)
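The probability line in the code above mixes each profile's share of the total weight with a uniform exploration term controlled by `g`. The following is a minimal, self-contained sketch of just that step. The helper `weighted_choice` is not shown in the original, so the version here is a hypothetical stand-in built on `random.choices`, and `mixed_probabilities` is a name introduced only for illustration.

```python
import random


def mixed_probabilities(weights, g):
    """Blend weight-proportional probabilities with a uniform term.

    Same formula as in the snippet above: with k arms and total weight s,
    p_j = (1 - g) * w_j / s + g / k, so every arm keeps at least g / k.
    """
    k = len(weights)
    s = sum(weights)
    return [(1 - g) * w / s + g / k for w in weights]


def weighted_choice(p, rng=random):
    """Hypothetical stand-in: return index j with probability p[j]."""
    return rng.choices(range(len(p)), weights=p, k=1)[0]


if __name__ == '__main__':
    p = mixed_probabilities([1.0, 3.0, 6.0], g=0.1)
    rng = random.Random(0)
    counts = [0] * len(p)
    for _ in range(10_000):
        counts[weighted_choice(p, rng)] += 1
    print(p)
    print(counts)
```

Because of the `g / k` term, a profile with a low weight is still tried occasionally, which is what lets the bot notice when a site's preferences change.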
With the library I wrote for session management, ban detection, and challenge resolution, a crawler can solve the challenge posed by a site with code like this:
def is_challenge_response(self, response):
    return response.css('head script:contains(site-id)::text')

def make_challenge_request(self, response, request):
    value = self.solve_challenge(response)
    if not value:
        return
    return FormRequest.from_response(
        response,
        formdata={
            'secret': str(value)
        },
    )
The following animation shows the directory structure and my commits for the project I worked on during 2017 and 2018. The large cluster is the set of spiders/crawlers. The “satellites” are refactorings of complex crawlers. The long branch consists of contributions to the common libraries.
The Parsers¶
From 2012 to 2016 I built parsers, analyzers, and translators for the COBOL, Java, M2, Natural, PowerScript, SQL, and Visual Basic programming languages. As part of that effort I also wrote the TatSu parser generator (see below).
Those parsers were sold under terms that don’t allow me to publish anything about them. In 2016, however, I wrote clean-room (from scratch) parsers for COBOL, M204, IBM MFS, Natural, PowerScript, SQL, and Visual Basic.
The following Gource animation shows the commits by the team to the parsers and the front-end webapps during 2016. The parsers are the projects in the center, and the webapps the flower-ended structures on the border. I am the dancing white box with the dragon logo.
The following charts show my commits relative to other members of the team.
-
TatSu
-
COBOL
-
Code Generation Webapp
-
Deployment Automation
-
Java
-
Legacy Analysis Webapp
-
MFS
-
Natural
-
PowerScript
-
Commons
-
SQL
-
Visual Basic
TatSu¶
During 2012 ANTLR v4 was still a work in progress, and I had great difficulty making ANTLR v3 parse through the ambiguous constructs of programming languages such as COBOL and Natural. I convinced my boss to let us develop our own parser generator which, though simple, was able to handle the ambiguities without all the work ANTLR was requiring.
After forty continuous days of work beginning in mid-December 2012, Grako was published as open source in February 2013. Grako later became TatSu. This is the Google Analytics map of visits to TatSu’s pages.
This is a fragment of the TatSu grammar for Java:
method_declaration_mixin
    =
    type_parameters:type_parameters
    type:type name:identifier
    parameters:formal_parameters
    dimensions:dimensions
    ['throws' throws:qualified_name_list]
    [
    | ';'  # for abstract and interface methods
    | block:block [';' ~ ]  # a trailing ;
    ]
    ;

method_declaration::MethodDeclaration
    =
    modifiers:{modifier}
    >method_declaration_mixin
    ;

interface_method_declaration::MethodDeclaration
    =
    modifiers:{interface_method_modifier}
    >method_declaration_mixin
    ;
And this is part of the corresponding Python code generated for the parser:
@graken()
def _method_declaration_mixin_(self):
    self._type_parameters_()
    self.name_last_node('type_parameters')
    self._type_()
    self.name_last_node('type')
    self._identifier_()
    self.name_last_node('name')
    self._formal_parameters_()
    self.name_last_node('parameters')
    self._dimensions_()
    self.name_last_node('dimensions')
    with self._optional():
        self._token('throws')
        self._qualified_name_list_()
        self.name_last_node('throws')
    with self._optional():
        with self._choice():
            with self._option():
                self._token(';')
            with self._option():
                self._block_()
                self.name_last_node('block')
                with self._optional():
                    self._token(';')
                    self._cut()
            self._error('expecting one of: ;')
    self.ast._define(
        ['block', 'dimensions', 'name', 'parameters', 'throws', 'type', 'type_parameters'],
        []
    )
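The generated code expresses PEG optionals and ordered choices as nested `with` blocks. To make that pattern easier to follow, here is a minimal sketch of how such context managers could work. It is not TatSu's implementation: `MiniParser`, `FailedParse`, `OptionSucceeded`, and `method_tail` are hypothetical names for this illustration, and the cut operator (`~`) is left out.

```python
from contextlib import contextmanager


class FailedParse(Exception):
    """Raised when the input does not match what the parser expects."""


class OptionSucceeded(Exception):
    """Raised by a successful option to skip the remaining alternatives."""


class MiniParser:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def token(self, tok):
        # Skip whitespace, then match the literal token or fail.
        while self.pos < len(self.text) and self.text[self.pos].isspace():
            self.pos += 1
        if not self.text.startswith(tok, self.pos):
            raise FailedParse(f'expecting {tok!r} at {self.pos}')
        self.pos += len(tok)

    @contextmanager
    def optional(self):
        # [ ... ]: on failure, backtrack and carry on as if nothing happened.
        start = self.pos
        try:
            yield
        except FailedParse:
            self.pos = start

    @contextmanager
    def choice(self):
        # ( a | b ): the first option that succeeds ends the choice.
        try:
            yield
        except OptionSucceeded:
            pass

    @contextmanager
    def option(self):
        # One alternative: backtrack on failure, signal success otherwise.
        start = self.pos
        try:
            yield
        except FailedParse:
            self.pos = start
        else:
            raise OptionSucceeded()


def method_tail(p):
    """Parse [ ';' | '{' '}' [';'] ] and return the final position."""
    with p.optional():
        with p.choice():
            with p.option():
                p.token(';')
            with p.option():
                p.token('{')
                p.token('}')
                with p.optional():
                    p.token(';')
            # Reached only when no option succeeded.
            raise FailedParse('expecting one of: ; {')
    return p.pos


if __name__ == '__main__':
    for source in (' ;', '{ } ;', 'x'):
        print(repr(source), method_tail(MiniParser(source)))
```

The trailing `raise` inside the choice plays the role of the `self._error(...)` call in the generated code: it runs only when every option has backtracked.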
The Practice¶
Continuous study and practice are the big secret behind all important achievements.
Exercism¶
I have been completing the Python, Swift, and Go tracks at the Exercism web site as a way to practice and obtain reviews from some of my peers.
My solutions to the problems can be found here. This is a sample of the kind of solutions I wrote:
Project Euler¶
I wrote about Python in 2000, and started using it for projects in 2011, when it became obvious that the programming language would become mainstream and important. To gain experience, among other efforts, I solved the problems at the Project Euler web site. My solutions to more than 50 of the problems are in this repository.
This is a sample:
#!/usr/bin/env python
"""
Solution to Project Euler Problem 24
http://projecteuler.net/

by Apalala <apalala@gmail.com>
(cc) Attribution-ShareAlike
http://creativecommons.org/licenses/by-sa/3.0/

A permutation is an ordered arrangement of objects. For example, 3124 is one
possible permutation of the digits 1, 2, 3 and 4. If all of the permutations
are listed numerically or alphabetically, we call it lexicographic order. The
lexicographic permutations of 0, 1 and 2 are:

    012 021 102 120 201 210

What is the millionth lexicographic permutation of the digits 0, 1, 2, 3, 4,
5, 6, 7, 8 and 9?
"""
from itertools import permutations
from itertools import islice


def nth_element(iterable, n):
    return next(islice(iterable, n - 1, n))


def nth_permutation(digits, n):
    return ''.join(nth_element(permutations(digits), n))


def test():
    assert '210' == nth_permutation('012', 6)


if __name__ == '__main__':
    test()
    print(nth_permutation('0123456789', 10 ** 6))
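The solution above walks through the permutations one by one until it reaches the n-th. As an alternative, here is a sketch that computes the n-th permutation directly with the factorial number system, without enumerating the earlier ones. The function name `nth_permutation_direct` is introduced here for illustration and is not part of the original repository.

```python
from math import factorial


def nth_permutation_direct(digits, n):
    """Return the n-th (1-based) lexicographic permutation of digits.

    With r symbols still unplaced, each choice of the next symbol
    accounts for a block of (r - 1)! permutations, so integer division
    by that block size selects the next symbol.
    """
    pool = sorted(digits)
    index = n - 1
    if not 0 <= index < factorial(len(pool)):
        raise ValueError('n is out of range for the given digits')

    result = []
    for remaining in range(len(pool) - 1, -1, -1):
        position, index = divmod(index, factorial(remaining))
        result.append(pool.pop(position))
    return ''.join(result)


if __name__ == '__main__':
    assert nth_permutation_direct('012', 6) == '210'
    print(nth_permutation_direct('0123456789', 10 ** 6))
```

For ten digits the enumeration approach is already fast enough, so the direct computation mostly matters for longer inputs, where the number of permutations grows factorially.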
The Story¶
Experience takes work… and time. This section is about some of the things I’ve been involved in since 1985.
I’ve always been good at designing user interfaces, but I’ve kept away from it for some years in the hope that the technologies behind Web UIs would stabilize a bit (they haven’t). The main goal of a UI should be to present information in a way that allows users to obtain meaning from it.
DUnit¶
I was the original developer of DUnit, the library that became the official testing framework for Embarcadero Delphi (previously Borland Delphi) until it was replaced by DUnitX.
TRANUS¶
I was co-owner, CTO, and main programmer in Modelistica from 1985 to 2005. During that period I made important contributions to the very successful TRANUS urban and regional simulation system, and I developed its user interface, to make the lives of the modelers more enjoyable.
The original model was developed using Fortran, and the UI was first in Visual Basic and then in Borland Delphi.
The image below is of TRANUS displaying the road types for the city of Recife in Brazil.
To be able to account for forbidden turns and turn delays at intersections, the internals of the model worked with the dual of the network graph. This is the FORTRAN code for finding a minimum path in the dual graph:
subroutine FindPath(i, j, astar, path, pathSize, cost, status)
    real astar(*)
    integer path(*)
    integer pathSize
    integer status

    integer(4) :: IHMAX = 0
    real :: xcmin(NLINKRUT)
    integer :: xpmin(NLINKRUT)
    real :: cosReal(NLINKRUT)
    integer l, lnrCan, irfte, iofte,lrc
    double precision :: ctransf,costocan, MaxCostSeen, pcost
    integer nh, lnr, lnrAbs, ircan
    logical(1) :: mark(NLINKRUT)
    integer(4) :: inode, cnode
177 format(/,' N', I6,' Or',I6,' Des',I6,' T',I6,' R',I6,' C',2F12.5)

    xcmin = RINF
    xpmin = 0
    cosReal = RINF
    call heapalloc(MXCON*NLINKRUT)
    inode = matz(i)
    do while(graph(inode)%linkrut /= 0)
        pcost = node_ovr_cost(inode)
        lnr = abs(graph(inode)%linkrut)
        xcmin(lnr) = pcost
        cosReal(lnr) = node_cost(inode)
        call heappush(pcost + astar(lnr), inode)
        inode = inode+1
    end do

    status = msg_NoPath
    mark =.FALSE.
    MaxCostSeen=0
    nhmax = 0
    do while(.not. heapempty())
        call heappop(pcost, inode)
        lnr = graph(inode)%linkrut
        lnrAbs=abs(lnr)
        l = ilink(lnrAbs)
        if (mark(lnrAbs)) CYCLE
        mark(lnrAbs)=.TRUE.
        if(.false..and.MaxCostSeen /= 0 .and. (pcost/MaxCostSeen) < 0.999991) then
            print *, 'PASMIN ',pcost,MaxCostSeen, pcost/MaxCostSeen, heaplen
            print *, ' from ',numzon(i), ' to', numzon(j)
            status = msg_PasosInconsistente
            call heapcheck
            return
        endif
        MaxCostSeen=pcost
        nf=idzon(l)
        if(nf /= 0) then
            if(nf == j) then
                status=msg_OK
                exit ! end of search
            endif
            cycle ! paths cannot go through zones
        endif
        ! search for candidates
        cnode = lnrData(lnrAbs)%connected
        do while(graph(cnode)%linkrut /= 0)
            lnrcan = abs(graph(cnode)%linkrut)
            if (.not. mark(lnrcan)) then
                costocan = xcmin(lnrAbs) + node_ovr_cost(cnode)
                if (costocan < xcmin(lnrcan)) then
                    xcmin(lnrcan) = costocan
                    xpmin(lnrcan) = lnr
                    cosReal(lnrcan) = cosReal(lnrabs) + node_cost(cnode)
                    costocan = costocan + astar(lnrcan)
                    call heappush(costocan, cnode)
                endif
            endif
            cnode = cnode+1
        enddo
        ihmax=heaplen
    enddo

    if(status.eq.msg_OK) then
        cost = cosReal(lnrabs)
        call BuildPath(xpmin,lnr, path, pathSize, status)
    endif
end subroutine
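The idea of searching over the dual graph, where links rather than intersections are the search states, can be shown in a few lines of Python. This is a simplified sketch and not a translation of the TRANUS routine: the data layout (`links`, `forbidden_turns`, `turn_delay`) and the function name `find_path` are assumptions made for this example, and the default heuristic of zero reduces A* to Dijkstra's algorithm.

```python
import heapq


def find_path(links, origin, destination,
              forbidden_turns=frozenset(), turn_delay=None, heuristic=None):
    """Minimum-cost path over links, honoring turn bans and turn delays.

    links: dict mapping link id -> (from_node, to_node, cost)
    forbidden_turns: set of (link_in, link_out) pairs that are not allowed
    turn_delay: dict mapping (link_in, link_out) -> extra cost
    heuristic: admissible estimate of the remaining cost from a link's end
    Returns (cost, [link ids]) or None when there is no path.
    """
    turn_delay = turn_delay or {}
    heuristic = heuristic or (lambda link_id: 0.0)

    outgoing = {}
    for link_id, (start, _end, _cost) in links.items():
        outgoing.setdefault(start, []).append(link_id)

    best = {}      # best known cost to reach the end of each link
    previous = {}  # predecessor link on the best known path
    heap = []
    for link_id in outgoing.get(origin, []):
        best[link_id] = links[link_id][2]
        previous[link_id] = None
        heapq.heappush(heap, (best[link_id] + heuristic(link_id), link_id))

    done = set()
    while heap:
        _, link_id = heapq.heappop(heap)
        if link_id in done:
            continue
        done.add(link_id)
        end_node = links[link_id][1]
        if end_node == destination:
            cost, path = best[link_id], []
            while link_id is not None:
                path.append(link_id)
                link_id = previous[link_id]
            return cost, path[::-1]
        for candidate in outgoing.get(end_node, []):
            if candidate in done or (link_id, candidate) in forbidden_turns:
                continue
            cost = (best[link_id]
                    + turn_delay.get((link_id, candidate), 0.0)
                    + links[candidate][2])
            if cost < best.get(candidate, float('inf')):
                best[candidate] = cost
                previous[candidate] = link_id
                heapq.heappush(heap, (cost + heuristic(candidate), candidate))
    return None


if __name__ == '__main__':
    network = {
        'AB': ('A', 'B', 1), 'BD': ('B', 'D', 1),
        'BC': ('B', 'C', 1), 'CD': ('C', 'D', 1),
    }
    print(find_path(network, 'A', 'D'))
    print(find_path(network, 'A', 'D', forbidden_turns={('AB', 'BD')}))
```

Because costs are labeled on links and not on intersections, the same intersection can be reached at different costs depending on the link used to enter it, which is what makes turn restrictions expressible at all.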
MALLA¶
One of my first projects consisted of modeling space frame structures, calculating their per-beam stress, and optimizing the amount of material used in the struts that would bear the loads.
The system was developed using C, C++, and Borland Delphi.
SEPA¶
For many years, starting in 1986, I helped the School of Architecture and the School of Law of the Universidad Central de Venezuela (Central University of Venezuela) process their admission exams.
EVA¶
As an entrepreneur, I used to take consulting and programming gigs with companies, institutions, and people who knew my work.
EVA was a small system that would take the answers given by students about their teachers’ performance, and present the results in a unified UI, by grade, section, teacher, subject, or school. The grades were graphically represented by martial arts belt colors.
Gol 2014¶
Well, I haven’t kept completely away from Web User Interfaces. I built responsive UIs for several projects using SQLAlchemy, Flask, WTForms, and Bootstrap. Because play is important, in my spare time I built a static site generator for tracking friendly bets over the 2014 FIFA World Cup:
Consulting¶
My consulting work consisted mostly of creating analysis, design, testing, and planning documents and presentations, and sometimes leading a team to deliver the software. It included work for some very large companies in Venezuela and the US.
The following process diagram describes the project plan for the time-limited migration of a software system in Java to Visual Basic as part of the sale of a corporate business unit. Parallelizing the work allowed the product to be delivered ahead of schedule, with zero bugs.
The Words¶
I was a writer for Windows Tech Journal, VB Tech Journal, and Java Report. I was also an associate editor for Windows Tech Journal, and a correspondent for In Publishing while writing for the Borland Community Site about Open-Source Software.