Pages

Monday, December 3, 2012

ANTLR, first impression

This post continues the story from the previous one.

So, following Stack Overflow link I landed at ANTLR homepage. Psychological scam alarms went off, bunch of text riddled with hyperlinks and varying typefaces, prominent button accompanied with "Oh come on, download me now", testimonials column and a guy making silly face. It took me a few seconds to actually start comprehending the content. Image below silly guy encouraged me to stay, it looked like FSA visualization I once saw during the college and FSA are crucial part of formal language processors. I started to search for something that identifies what the site is about and I read the first few lines:
What is ANTLR?
ANTLR, ANother Tool for Language Recognition,
... lot of technical terms that ordinary eye skips ...
 target language.
"OK, that looks like what I'm looking for", said to myself. I'll stop with through storytelling now and the silly guy is Terence Parr, the guy behind the ANTLR project. Anyway, next thing was to have a taste how the tool works. I was disappointed at first with a fact that it is a command line tool but that wouldn't stop me from learning it. Soon I've found out that it has additional tool with GUI, ANTLRWorks, so I downloaded it and started looking for some kind of tutorial. Thing is with this kind of software that you have to learn a metalanguage for defining grammar rules and examples are best way to start. I've found a good one on a different web site. It's an example of not so basic calculator/interpreter for Java. I copy-pasted all three grammars ANTLRWorks, pressed "generate code" button and it worked. I had a clue how it works but wasn't really ready to work on my own so I've tried different approach, make a parser for a very simple language.


If you remember old days in LOGO, you'll smile. My first informatics class in elementary school (5th grade, I remember it as if it was yesterday :) ) was about LOGO and moving the turtle with FD and RT commands. FD 100 moved the turtle forward 100 pixels and as it moved it left a straight line. RT 90 rotated turtle 90° clockwise. Repeat that four times and you'd draw a rectangle. That's the simple a language I've decided to parse. Also, I've decided to split the grammar to three parts because last example did so and I believed it's a proper way to do and this was my grammar file for lexer:

lexer grammar MyLogoLexer;

options {
language=CSharp3;
}

fragment SIGN
: '+' | '-';
fragment SPACE
: ' ' | '\t';

NUMBER : '0' | SIGN? '1'..'9' '0'..'9'*;
FORWARD : 'FD';
ROTATE : 'RT';

NEWLINE : ('\r'? '\n')+ { Skip(); };
WHITESPACE
: SPACE+ { Skip(); };

Pretty simple vocabulary, white spaces are skipped, digits are treated as numbers and keywords are FD and RT. Second grammar describes what is formally called the parser, a syntax analyzer and a part that build a syntax tree. For the language I was building, the parser was even simpler than lexer:

parser grammar MyLogoParser;

options {
language=CSharp3;
output = AST;
tokenVocab = MyLogoLexer;
}

script : statement* EOF!;

statement
: FORWARD NUMBER
| ROTATE NUMBER;

Basicaly the root of the tree is a script. Script contains zero or more statements and statements are either "forward" or "rotate" commands. ANTLRWorks is full of little features that help writing and validating grammar and graphical representation of a rule is one of them. Look at how nice it is, feast your eyes on it's awesomeness:


And now the last grammar, so called tree grammar, not because it builds the tree but because it traverses the tree and performs node actions:

tree grammar MyLogoTree;

options {
language=CSharp3;
ASTLabelType = CommonTree;
tokenVocab = MyLogoParser;
}

@header {
using System;
}

@members {
double x = 0, y = 0, angle = 0;

private double toDouble(CommonTree node) {
double value = 0.0;
   String text = node.Text;
   value = Double.Parse(text);
   return value;
  }
}
script : statement*;

statement 
: FORWARD v=NUMBER {
double length = toDouble($v);
angle = Math.PI * this.angle / 180;
x += Math.Cos(angle) * length;
y += Math.Sin(angle) * length;
Console.WriteLine("Moved to {0}, {1}", x.ToString("0.#"), y.ToString("0.#"));
}
| ROTATE v=NUMBER {
angle += toDouble($v);
Console.WriteLine("Facing {0}°", angle.ToString("0.#"));
};

When I wrote grammars, I've hit "generate code" button, got errors, corrected them, retried and after a few tries it succeeded. And than I got my self in from of the weird wall, how to run the generated code? It was midnight already and I was trying to break the habit staying up late. I should have been trivial matter but it turned out that it took two hours of my life. I expected that generated source code looked like this:


From the perspective of the user code, a single class that does the job. But instead it turned out that user's code has to mouth feed EVERY phase on it's own. See:



Notice the absence of gray arrows. Thing are even worse, there are additional objects the user code has to take care of besides generated classes. Mimicking the the Java example ("not so basic calculator" mentioned at the beginning), I figured the code should look like this:

static void Main(string[] args)
{
using (var input = new StreamReader("input.txt")) {
MyLogoLexer lexer = new MyLogoLexer(new ANTLRReaderStream(input));

MyLogoParser tokenParser = new MyLogoParser(new CommonTokenStream(lexer));
tokenParser.TreeAdaptor = new CommonTreeAdaptor();
var parserResult = tokenParser.script();

CommonTree tree = (CommonTree)parserResult.Tree;
MyLogoTree treeProcessor = new MyLogoTree(new CommonTreeNodeStream(tree));

treeProcessor.script();
}
}

But there were two big problems: parser's script method was private and tree processor's script method didn't exist at all. For method visibility issue the Google led me back to the Stack Overflow. Each parser and tree grammar generates a method and there is an undocumented feature that defines visibility of such methods. When generating Java code, the feature is ignored and all methods are public but when generating C# code methods are private unless the undocumented feature is used. Since the example that I used as learning material was meant to generate Java code, it didn't mention this issue. The lack of script method in tree parser was my fault entirely, I didn't noticed that the grammar rules has to be repeated in the tree processor even if they have no action. I mean, not all rules have to be repeated, only those that should generate a method.

To summarize, half of the workflow is good, enjoyable and looks mature but the other half is drastically unpolished. That contrast is the problem too, at first you'd live in high hopes that you are working in mature IDE such as Eclipse and Visual Studio and than BAM, this doesn't work, that doesn't work the way you thought and so on.

As I woke up next morning I looked at other options, there aren't looking better than ANTLR. There is one with nice premise but it's in alpha phase. So I decided to try that one and to give ANTLR another chance.

Saturday, December 1, 2012

ANTLR, introduction

I did a little research on libraries for processing formal languages, both for my self and for the project I'm working on, the Stareater. Previous versions of Stareater had a feature that allowed having whole expressions instead of plain constant numbers in data files. For instance, for farmer efficiency you could write something that evaluates to 3+0.125*level of farming technology in a single attribute instead of having multiple attributes, one for constant part, one for technology identifier and one for technology level influence factor. And on top of that, you can change operators in an expression any way you want. But there is a catch, you have to write it either normal or reverse Polish notation (postfix and prefix notations respectively) so the last example would look like this (highlighted part):


It's bearable when expressions are simple but take a look at the image below. That is an expression for the maximum number of  starports on a planet. Floor is a function for rounding down, "^" is the power operator, PLANET_VELICINA is the size of a planet and SVEMIRSKI_PROGRAM_LVL is a level of relevant technology.


It translates to Floor( (1 + 0.1 * (tech_level - 1)) * (planet_size ^ 1.5) / 100 ). So, why Polish notation instead of "plain English", the infix notation? I wrote about it before but this is a summary: it's easy to parse. Parser implementation is straightforward there is no operator precedence issue. For instance, evaluator of postfix notation expression could be explained in a few sentences (knowledge of basic data structures assumed): read input from left to right, add operands to the top of the stack, apply operators to the operands on the top of the stack, remove used operands and push the result to the top of the stack. If the expression was valid, in the end stack would contain only one element and that would be the result of the expression. Parser is very similar to described evaluator but instead of calculation, the so called syntax tree is built. What is syntax tree you might wonder. Well now we are stepping in the area of computer science and the rest of the post will assume that you have some degree of familiarity with the field of formal language processing.


This, the picture above, is the syntax tree for 3+0.125*level of farming expression. Arrow at the top points to the root of the tree (knowledge of the trees from graph theory assumed). To get the value of the expression, user code (the one using the tree) has to simply ask the root to evaluate itself. Rest is the magic of composite design pattern, root would see that it needs two or more operands and sum them and it will ask it's children to evaluate them self. Since the first child is a constant, it will simply return 3. Second child on the other hand will ask it's children to evaluate themselves and multiply their values. As you can see a nice utilization of recursion and dynamic polymorphism.

Building syntax tree for infix expression complicated thing, at first there are binary operators (assumed reader knows what that mean) which have one set of not-so-easy-to-implement rules, then add to the mix unary operators and you'd get stuff complicated by factor 1.5 and things can be complicated even further with brackets and functions. On top of that you have to decide whether a / b / c should be interpreted as (a / b) / ca / (b / c) or rejected as invalid. It would be hell to code and test all of it from scratch. Fortunately our fathers and grandfathers solved that for us when they invented compilers. Programmers use compilers for transforming a source code to an executable files but compilers are more general then that, they translate one language to another. In case of C++, source code written by the rules of C++ is translated to a machine code of the target hardware. There is fair amount of computer science theory involved here and it is named the formal language.

Main phases of compilation process

Every programming language is formal language and as such they have vocabulary (set of lexemes), grammar rules (syntax) and semantic rules. Vocabulary usually contains list of keywords (if, for and default for example) and regular expressions for variable names, constant numeric and textual values and other simple stuff though regular expressions are formal languages of their own. Grammar rules define more complicated stuff such as assignment has form of variable followed by '=' character followed by expression and ended with ';'. Semantic rules cover stuff which are hard to cover by the grammar such as checking if the type of a value can be stored in a variable. There are tools for generating source code for compiler's analysis phases, an inception one might say.

Problem of parsing text containing infix expression is basically that, lexical, syntax and semantic analysis. So I goggled a little bit for C# solutions and found that most queries lead to ANTLR, Java tools that can generate C# source code from grammar rules. More about my impressions with the tool in the following post(s).

Sunday, August 12, 2012

Blog za Zvjezdojedaca

Odlučih odvojiti "filozofske" postove od progress reportova Zvjezdojeca. Tako da će ubuduće postovi vezani za taj projekt ići na poseban blog:

http://stareater4x.blogspot.com/

Blog će biti pisan na engleskom jeziku no početnih par postova će biti na hrvatskom jer je riječ o postovima s ovog bloga. I da, verzija 0.4 je gotova.

Monday, June 4, 2012

Zvjezdojedac 0.3.3


http://code.google.com/p/zvjezdojedac/downloads/list

Prošli tjedan stavih novu verziju Zvjezdojeca na download. Kao što sam pisao u prošlim postovima, u novoj verziji je implementiran space combat. Uz sam space combat dodah još malo contenta, warp pogon i čestični štit. Doduše te komponente su dodane čisto zbog testa space combata.

Thursday, May 24, 2012

Space combat commitan

Whoohoo! Nakon pet mjeseci source code je konačno stabilan. Doduše ima jedno mjesto gdje se krši, u starom sučelju za dizajniranje brodova. Sredim to, istestiram još malo bitke, malo zglancam sučelje za bitke i ide novi build u download sekciju.

Monday, May 21, 2012

Space comat radi

Space combat u Zvjezdojecu je implementiran. Još ga nisam do kraja testirao ali chance to hit, damage reduction, i uništavanje brodova funkcionira. Čim stavim da se bitke mogu razriješiti, uploadam u repozitorij. Trenutno se combat turnovi odbrojavaju ali se ništa ne dogodi kad se prođe "zadnji" i potpuno uništenje protivničke strane isto trenutno ne dovodi do razrješenja bitke.

Našao sam zašto se događao bug kod učitavanja igre, zbog kojeg si igrači odjednom dobe par tisuća brodova. Stari problem s decimalnom točkom. Količina ostatka gradnje je bila pohranjena s decimalnim zarezom (regionalne postavke na Windowsima) a kod učitavanja se tražila točka (eksplicitno specificiran format). MICROSOFTE, ZAŠTO MJEŠAŠ STROJNI I KULTURNO OVISAN PRIKAZ BROJEVA? Naime, C#-u double tip podatka služi za pohranu realnih brojeva (uz kolku tolku preciznost), na svakoj varijabli i konstanti tipa double se može pozvat metoda za pretvorbu u tekst (tako reći). Moj problem nastaje u tome što se broj pretvara u tekst poštujući regionalne postavke operativnog sustava. Osobno smatram da se to ne bi trebalo tako raditi je vrijednost tipa double sama po sebi ne pripada ni jednoj državi ili kulturi. Kada se double vrijednost pita za tekstualni zapis, trebao bi se koristiti neki univerzalni format. Univerzalan u smislu da ne ovisi o ničem osim o samoj brojčanoj vrijednosti. Turčin, Arap, Kanađanin i Francuz bi za istu double vrijednost trebali dobiti isti tekstualni zapis. Za prikaz broja u kulturno ovisnom zapisu bi se trebala koristiti metoda kojoj se eksplicitno definira način pretvorbe broja u tekst.

Nazad na temu. Zadnjih par dana sam se uhvatio u koštac s balansiranjem igre. Htio sam se koncentrirati na testiranje i dovršavanje space combata ali brojke koje sam još davno postavio za snagu napada i obrane su takve da bitka traje u nedogled. Kako sam bio prije testirao tehnologije i mehanizme automatske nadogradnje tako je ispalo da igrači odmah u startu imaju do kraja nadograđen tier 3 oklop i da obični kolonizator ima 12k hit pointa dok je napadač s kojim sam testirao imao 8 strojnica koje su na taj oklop radile oko 9 damagea. S takvim brojkama bitka je trajala 170 krugova a ciljam na to da je ograničenje broja krugova za bitku negdje oko 20. Problem kod kolonizatora je što moraju s jedne strane biti ogromni a s druge ranjivi. Već prije sam razmišljao o tom da dodam specijalnu opremu za proširenje teretnog prostora i mislim da ću to napraviti u ovoj verziji. Nešto u stilu battle podsa iz Master of Oriona 2 ali da se uz povećanje nosivost se smanje hit pointi. To ja lagan dio posla, ono što je crna rupa koja upija sate mog rada je izbor brojki za damage oružja. Recimo za sam što na empirijski što na egzaktan način došao do načela po kojem određujem brojke za nosivost i izdržljivost vrsta trupova i sve komponente koje utjeću na vjerojatnost pogotka. Ali brojke za damage mi nikako ne idu. Načela kojih bih se držati su da brodovi iste veličine i tehnologije se međusobno unište unutar 5 krugova i da manje više svaki vojni brod može u dogledno vrijeme uništiti kolonizatora.

No već ću nešto smislit, dovršiti testiranje i uploadati u repozitorij. Nakon toga malo dotjerati GUI, spakirati u release i staviti na download.

Wednesday, April 4, 2012

Zvjezdojedac, zastavice

Zar je već prošlo mjesec dana od zadnjeg posta? Hmm, dobar ovaj posao koji radim kad mi ne primijetim da vrijeme proletjelo.

GUI za space combat napreduje polako ali sigurno. Konačno sam napravio prikaz pozicija brodova (ona 4 crna kvadratića iznad popisa brodova). Prikazuju se trokutići koji govore na kojoj poziciji ima brodova i kamo se kreću, prikazuju se zastavice za prisutne igrače na pojedinoj poziciji i klik na kvadratić šalje GUI event s potrebnim podacima. Obrada tog događaja baš ne funkcionira najbolje ali to je slijedeće na dnevnom redu. Još mi od vidljivog dijela GUIa nedostaju kontrole za upravljanje brodovima i za upravljanje bitkom (end turn i auto combat gumbi) no to već imam spremno u obliku skice.

Htio sam biti nadobudan prošlu nedjelju i napraviti kompletni space combat u jednom danu. Kako je taj dan bio prvi april, dobro sam se našalio. Dan je završio tako da sam u SPAZu došao s levela 25 na level 48. Realno, mislim da za dovršiti space combat treba više od 12 h. Jučer sam radio na pretvaranju "taktike" i "poželjne pozicije" i potrošio sam oko sat i pol. Po starom modelu brodova za dizajn se trebala definirati taktika, da li će brod funkcionirati kao presretač, da li će se držati po starni ili će biti na nekoj srednjoj udaljenosti. Po novom modelu brodovi se mogu kretati po pravcu i taktike su se pretvorile u pozicije. Najbitnija razlika je da je za svaku taktiku bilo posebno definirano kakvi su joj efekti a za pozicije jedan set formule koje vrijede za sve pozicije. Efekti pozicije računaju se tako da se u formule se uvrsti udaljenost između brodova. Ne izgleda kao velik posao ali eto, ode sat vremena samo tako jer promjene nisu samo u jezgri igre već i na GUIu i u data fajlama (karakteristike, nazivi i prijevodi) i još k tome dosta sam se vrtio koji razred instancira kojeg.

Jedno dana kada ću imati vremena, prerovati ću source code vezan za dizajn brodova. Trenutno za svaki tip komponente broda postoje dva razreda, jedan koji općenito opisuje komponente (tzv. info) i jedan koji opisuje konkretnu instancu. Ono što mi se ne sviđa je da su info razredi ugnježdeni pripadnim razredima za konkretne. Razlog tome je bio funkcionalnost da samo info razredi mogu napraviti instancu konkretne komponente. Ta funkcionalnost je ostvarana tako da razred za komponente imaju privatni konstruktor a info razredi pošto su ugnježdeni jedini imaju pristup tom konstruktoru. Imam ideju kako to smisleno izvesti bez ugnježđivanja razreda ali o tom potom, to je low priority.

Vidjeh da ima već 3 mjeseca od zadnjeg commita koda u repozitorij. U međuvremenu se zbilo dosta promjena ali stvar još uvijek nije stabilna a ne volim objavljivat kod koji ne radi. Trenutno kada se inicira space combat, GUI za bitku se ne da ugasiti i igrač ne može dalje igrati. Naime, kada se ugasi prozor za bitku, program zatraži igru slijedeći nerazriješeni konflikt a pošto se trenutno konflikti ne mogu razriješiti, igra mu vrati isti i tako u beskonačnost. Kad napravim space combat, tako će sjesti jedan fini debeli commit, uff.

Friday, March 9, 2012

Zvjezdojedac, space combat GUI

Polako ali ide. Svako tolko naletim na nešto na grafičkom sučelju što nemrem riješiti s nečim gotovim već moram potrošiti cijelu noć. Source code ću uploadati u repozitorij tek kada bude stabilan a to će biti tek kada će bitke funkcionirati. Naime, trenutno se ne može započeti novi krug ako se se ne razriješe sve bitke a bez funkcionalnog space combata to se ne može. Napravio sam tako sustav kako bi kasnije lakše odvojio "dužnosti" jezgre igre i grafičkog sučelja. No, da ne bi sve ostalo na obećanjima i frazama tipa "radim, radim" i "bit će, bit će", prilažem screenshoot trenutnog stanja sučelja za svemirske bitke:


Crni kvadrati s crvenim linijama su polja s pozicijama boraca. U kvadratima će se nalaziti oznake tipa da li igrač ima brodove na toj poziciji i da li mu dio brodova odlazi ili dolazi na tu poziciju. Klikom na kvadratić će se prikazati koji su sve brodovi na toj poziciji. Za prikaz liste brodova predviđen je prostor ispod. Na screenshootu je prikazana situacija kada svaka strana ima jedan tip brodova pa se ne vidi sva funkcionalnost liste. I da, iz nekog čudnog razloga bot je uspio u jednom krugu napraviti 97 tisuća colony shipova :). Blah, mislim da je bug u pamćenju ostatka gradnje. Uglavnom, ideja je da svaka stavka u listi skup brodova s istim dizajnom i da su prikazani prosječna izdržljivost štitova i oklopa za takve grupe.

Now back to work.

Monday, February 27, 2012

Zvjezdojedac i Lovac na demone

Kao i obično, razvoj igre ide jako sporo. Nažalost, čini mi se kako Zvjezdojeca radim isključivo za vlastitu dušu. Eh, da ima više za pokazati, više bi ga reklamirao a time i više radio na njemu.

Anyway. Odlučio sam svaki dan napraviti nešto u Zvjezdojecu, pa makar to bila 1 linija koda. Naime, kako to biva, ono što realno treba isprogramirati je dosta kompleksnije od ono što vidi u fazi razmišljanja. Probao sam pristupiti implementaciji space combata na način da si uzmem jedan dana u tjednu i kodiram. Je l, dizajn dokument sam napisao, samo ga treba pretočiti u kod. Ispalo je da ima još fina količina posla oko arhitekture koda. Pristup da svaki dan napravim nešto je super za razbijanje blokada. Kad zapnem na nečem, ugasim razvojnu okolinu, slijedeći dan razmislim i natipkam rješejne.

Za sada imam napravljeno detektiranje situacija u kojima se događa svemirska bitka, opisnike stanja bitke i pripremljene podatke za prikaz na grafičkom sučelju. Možda se ovo ne čini kao puno ili težak posao ali dosta mi je problema stvarala činjenica da se krug (turn) ne može isprocesirati do kraja dok se ne obrade svemirske bitke. Konkretno, problem je bio kako uskladiti jezgru igre i grafičko sučelje, tko koga pušta dalje i tko kada barata s kojim podacima.

U međuvremenu sam radio malo na prototipu RPGa u koje bi isprobao onu silnu teoriju koju sam pisao u prošlim postovima, vezanim za attack, defense i armor reductition. Zapravo, ne bi nazvao to RPGom, više neki SBG (stats building game). Za sada imam napravljeno kreiranje lika, saveanje, loadanje, jedan probni quest i skoro gotov mehanizam za bitke. Kada bude koliko toliko gotovo, staviti ću link za download. No, ima jedna lijepa stvar a to je da iskustvo izrade mehanizama za bitke iz ove igre mogu iskoristit u Zvijezdojecu.