Tuesday, September 17, 2013

Python pages

Short cuts to useful Python tutorials from other web pages.
Snippets of code from the other pages are included in this blog.
Attribution is given in the form of a link to the original reference.

Lambda and multiple args

Python lists explained
The list type is a container that holds a number of other objects, in a given order. The list type implements the sequence protocol, and also allows you to add and remove objects from the sequence.

Python List len() Method

list1, list2 = [123, 'xyz', 'zara'], [456, 'abc']
print "First list length : ", len(list1)
print "Second list length : ", len(list2)


The Python yield keyword explained
3,400+ votes on Stack Overflow

Forgetting to call str.lower: the method object itself is not iterable
Change the line
content = inputFile.read().lower
to
content = inputFile.read().lower()
Your original line assigns the built-in method lower itself to the variable content, instead of calling str.lower and assigning its return value. A method object is definitely not iterable.
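
A minimal sketch of the difference, written for Python 3. The in-memory file (io.StringIO) and the sample text are assumptions made so the snippet runs anywhere; they are not part of the original question.

```python
import io

# Stand-in for the opened file in the original question.
input_file = io.StringIO("Hello World\nSecond LINE\n")

# Missing parentheses: this binds the method object itself.
not_called = input_file.read().lower
print(callable(not_called))

try:
    for ch in not_called:            # iterating a method raises TypeError
        pass
except TypeError as err:
    print("TypeError:", err)

# With parentheses the method runs and returns a lowercase string.
input_file.seek(0)
content = input_file.read().lower()
print(content.split())
```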

Python sets
>>> y = [1, 1, 6, 6, 6, 6, 6, 8, 8]
>>> sorted(set(y))
[1, 6, 8]

>>> s = set([1,6,8])
>>> print(s)
{8, 1, 6}
>>> s.update(range(10,100000))
>>> for v in range(10, 100000):
...     s.remove(v)
...
>>> print(s)
{1, 6, 8}

Python regexs
Python regexs 2

NameError: name 're' is not defined
-import the re module

Python - Using regex to find multiple matches and print them out

import re

line = 'bla bla bla\nForm 1\nsome text...\nForm 2\nmore text?'
matches = re.findall('\n(.*?)\n', line, re.S)
print matches


Python Dictionary (hashes)

Python : List of dict, if exists increment a dict value, if not append a new dict

urls = {'http://www.google.fr/' : 1 }
for url in list_of_urls:
    if url not in urls:
        urls[url] = 1
    else:
        urls[url] += 1
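
The same counting can be sketched with the standard library (Python 3). The snippet above does not define list_of_urls, so the sample list here is made up for illustration.

```python
from collections import Counter

# Hypothetical input; the original snippet does not define list_of_urls.
list_of_urls = [
    "http://www.google.fr/",
    "http://www.python.org/",
    "http://www.google.fr/",
]

# dict.get supplies 0 for unseen keys, so no membership test is needed.
urls = {}
for url in list_of_urls:
    urls[url] = urls.get(url, 0) + 1

# Counter does the same bookkeeping in one call.
counts = Counter(list_of_urls)

print(urls)
print(counts["http://www.google.fr/"])
```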
http://stackoverflow.com/questions/13757835/make-python-list-unique-in-functional-way-map-reduce-filter?rq=1

Is there a way in Python of making a List unique through functional paradigm ?
Input : [1,2,2,3,3,3,4]
Output: [1,2,3,4] (In order preserving manner)




In [29]: a = [1,2,2,3,3,3,4]
In [30]: reduce(lambda ac, v: ac + [v] if v not in ac else ac, a, [])
Out[30]: [1, 2, 3, 4]
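
In Python 3, reduce has to be imported from functools. The sketch below repeats the fold from the session above and, as an alternative that is not from the linked answer, uses dict.fromkeys, which relies on dicts keeping insertion order (Python 3.7+).

```python
from functools import reduce

a = [1, 2, 2, 3, 3, 3, 4]

# Same fold as above: append v only if it is not already in the accumulator.
unique_fold = reduce(lambda ac, v: ac + [v] if v not in ac else ac, a, [])

# Dict keys keep insertion order, so they give the same result for
# hashable items without rescanning the accumulator list each time.
unique_keys = list(dict.fromkeys(a))

print(unique_fold)
print(unique_keys)
```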


Pretty printing a dictionary
Use the module name to reference the pprint function
import pprint
pprint.pprint(...)
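
A short sketch of both entry points in the pprint module (Python 3). The dictionary is a made-up example.

```python
import pprint

# Hypothetical nested dictionary, just to have something to print.
config = {
    "name": "example",
    "tags": ["python", "erlang", "scala"],
    "limits": {"max_items": 100, "timeout_seconds": 30},
}

# pprint.pprint writes to stdout; width controls when lines wrap.
pprint.pprint(config, width=40)

# pprint.pformat returns the same text as a string instead of printing it.
text = pprint.pformat(config, width=40)
```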

Python sets

Python also includes a data type for sets. A set is an unordered collection with no duplicate elements. Basic uses include membership testing and eliminating duplicate entries. Set objects also support mathematical operations like union, intersection, difference, and symmetric difference.
Curly braces or the set() function can be used to create sets. Note: to create an empty set you have to use set(), not {}; the latter creates an empty dictionary, a data structure that we discuss in the next section.
Here is a brief demonstration:
>>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
>>> fruit = set(basket)               # create a set without duplicates
>>> fruit
set(['orange', 'pear', 'apple', 'banana'])
>>> 'orange' in fruit                 # fast membership testing
True
>>> 'crabgrass' in fruit
False

>>> # Demonstrate set operations on unique letters from two words
...
>>> a = set('abracadabra')
>>> b = set('alacazam')
>>> a                                  # unique letters in a
set(['a', 'r', 'b', 'c', 'd'])
>>> a - b                              # letters in a but not in b
set(['r', 'd', 'b'])
>>> a | b                              # letters in either a or b
set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'])
>>> a & b                              # letters in both a and b
set(['a', 'c'])
>>> a ^ b                              # letters in a or b but not both
set(['r', 'd', 'b', 'm', 'z', 'l'])
Similarly to list comprehensions, set comprehensions are also supported:
>>> a = {x for x in 'abracadabra' if x not in 'abc'}
>>> a
set(['r', 'd'])

 python remove whitespaces in string

Looping through python regex matches

import re

s = "ABC12DEF3G56HIJ7"
pattern = re.compile(r'([A-Z]+)([0-9]+)')

for (letters, numbers) in re.findall(pattern, s):
    print numbers, '*', letters

Why does comparing strings in Python using either '==' or 'is' sometimes produce a different result?

'is' is identity testing; '==' is equality testing. What happens in your code can be emulated in the interpreter like this:

>>> a = 'pub'
>>> b = ''.join(['p', 'u', 'b'])
>>> a == b
True
>>> a is b
False

How to test if a dictionary contains a specific key? [duplicate]

x = {'a' : 1, 'b' : 2}
if 'a' in x:    # dicts have no contains_key method; use the in operator
    ...
How to check a string for specific characters?

 How to return more than one value from a function in Python? [duplicate]

Sunday, July 14, 2013

Coding conventions

People spend time in corporations writing coding convention documents. Often they think in terms of an ideal world. Enforcing coding conventions creates friction on teams. Coding conventions should be lightweight. Coding conventions should make the code easier to read, understand, maintain, and debug. Coding conventions should not add unnecessary overhead to the project.

Sun Microsystems published the above Java coding convention in 1999. Most of the convention is irrelevant since formatting is handled by Eclipse.

I don't completely agree with their code structuring conventions. The first thing that their coding convention assumes is that programmers adequately comment their code. Since we know that this is not the case, you should at least write code that is SELF DOCUMENTING. See below.

1. Embedding function calls inside of function calls may be faster to code but makes the code more difficult to debug, read, and understand. The code is also self documenting if the var names are descriptive. Since people put limited comments in their code, using descriptive var names is the minimum requirement.

var = someMethod1(longExpression1,
                someMethod2(longExpression2,
                        longExpression3));

Should be

descriptive_var = someMethod2(longExpression2, longExpression3);

var = someMethod1(longExpression1, descriptive_var);

Now when you step through the debugger you can see the return value for the first method.
Others who read the code can see what the descriptive_var represents.
  
2. Complex boolean expressions should be broken down into individual booleans. The code is easier to debug, read, and understand. The code is also self documenting if the boolean names are descriptive. Since people put limited comments in their code, using descriptive boolean names is the minimum requirement.

//DON'T USE THIS INDENTATION
if ((condition1 && condition2)
    || (condition3 && condition4)
    ||!(condition5 && condition6)) { //BAD WRAPS
    doSomethingAboutIt();            //MAKE THIS LINE EASY TO MISS
}

Should be

// each var is descriptive and describes the condition
// each var is in effect a COMMENT
// the debugger shows the value of each boolean in the complex conditional expression

dv1 = condition1 && condition2;
dv2 = condition3 && condition4;
dv3 = !(condition5 && condition6);
dv4 = dv1 || dv2 || dv3;

if (dv4) {
    doSomethingAboutIt();     
}

3. The placement rule quoted below is nonsense. The best programmers I know declare vars where they are first used in a block, not at the beginning of the block. It makes the program easier to follow. You don't have to jump around to see the type of the var. The program looks like a dataflow/pipeline (data announces its arrival) and is easier to DEBUG.


6.3 Placement
Put declarations only at the beginning of blocks. (A block is any code surrounded by curly braces "{" and "}".) Don't wait to declare variables until their first use; it can confuse the unwary programmer and hamper code portability within the scope.

void myMethod() {
    int int1 = 0;         // beginning of method block

    if (condition) {
        int int2 = 0;     // beginning of "if" block
        ...
    }
}

4. Put vars in returns. Make the code self documenting and readable.

return (size ? size : defaultSize);

Should be

int realSizeOfSomeObject = (size ? size : defaultSize);
return realSizeOfSomeObject; 

5. There are some guys from MIT who completely disagree with putting the else on the same line as the closing curly brace }. They think the else goes on the line below the curly brace. I agree with the MIT guys.

7.4 if, if-else, if else-if else Statements
The if-else class of statements should have the following form:
if (condition) {
    statements;
}

if (condition) {
    statements;
} else {
    statements;
}

if (condition) {
    statements;
} else if (condition) {
    statements;
} else {
    statements;
}

6. No assignments in condition checks...
Do not use the assignment operator in a place where it can be easily confused with the equality operator. Example:

if (c++ = d++) {        // AVOID! (Java disallows)
    ...
}
should be written as
if ((c++ = d++) != 0) {
    ...
}

Thursday, April 11, 2013

Gearman

Gearman is a great tool for transparent distribution of jobs to a pool of worker machines.

http://gearman.org/

Friday, February 8, 2013

How to improve server hardware

Derek Pappas: Maybe they will figure out that they need to take the servers apart, put the components in separate packages that have different cooling requirements, run the liquid cooling out of the building instead of using air, put the power supplies at the rack level instead of in the servers, which they just heat up, and figure out how to make the whole system like Lego so you can add as much capacity to each system as the CPU supports with respect to DRAM and HDDs.

Thursday, February 7, 2013

Have integration meetings. Skip all the others.

In response to PG's YC article on meetings I have this to say.

Hiding in your office or cubicle staring at the monitor for 12 hours a day does not get projects finished. There are integration issues that have to be ironed out between developers. Successfully pushing data down entire software or hardware pipelines usually requires coordination and face-to-face interaction. Meetings have their place in the project development cycle. Integration of discrete software modules rarely happens without meetings, particularly if the project is large and building something complex such as a graphics chip pipeline (yes, that is software: RTL) or a data processing pipeline spread across two computer centers running downloaders, MapReduce data processing jobs, and indexers.

Good luck trying to get the engineers to talk to each other about integration without a regular meeting more than once a week (this is based on my experience at Intel and Sun). Of course there are other companies in the Valley that are less organized, where the people work harder and get less done under more pressure because they do not talk to each other (that would be a waste of time in the eyes of management). They tend not to have coordination meetings. And when they do have meetings the managers mostly do the talking. And good luck to the manager who thinks he can build a team with a common set of values without having meetings that go over the mission that the team is on.

Meetings that review PERT charts on a weekly basis for the duration of the project are a waste of time. Meetings to review status are a waste of time. Meetings to discover issues are a waste of time. Engineers will tell you in private what the issues are. Intel used to have open communication when Grove was running the show. But that company and time were an exception.
There is something to be said for aligning everyone behind one single short term milestone, meeting daily to ensure that the milestone is being met, meeting the milestone, and then letting the team take over the project management for the rest of the project.

Saturday, August 25, 2012

Imagej docs

http://stackoverflow.com/questions/7218309/smoothing-a-jagged-path
Tutorials
http://albert.rierol.net/imagej_programming_tutorials.html#ImageJ

API
http://rsbweb.nih.gov/ij/developer/api/ij/

http://rsbweb.nih.gov/ij/developer/api/index.html

Stackoverflow
 http://stackoverflow.com/questions/11495020/imagej-api-getting-rgb-values-from-image

ImageJ list
https://list.nih.gov/cgi-bin/wa.exe?A0=IMAGEJ 

Books
http://www.imagingbook.com/index.php?id=111
http://www.imagingbook.com/index.php?id=98

PPT
Larsen Image analysis-filtering

Math
http://www.cafeaulait.org/course/week4/40.html 

Code
http://www.myoops.org/twocw/mit/NR/rdonlyres/Civil-and-Environmental-Engineering/1-00Introduction-to-Computers-and-Engineering-Problem-SolvingFall2002/C5DB3655-BC62-4010-9A5D-DFE91D68F5BF/0/MedianFilter.java 
// http://download.java.net/jdk7/archive/b123/docs/api/java/awt/image/Raster.html
        // http://download.java.net/jdk7/archive/b123/docs/api/java/awt/image/WritableRaster.html
        // http://kickjava.com/src/java/awt/image/Raster.java.htm
        // http://kickjava.com/src/java/awt/image/BufferedImage.java.htm
        // http://stackoverflow.com/questions/5069678/cutting-up-image-illegalargumentexception-for-createwritablechild
        // http://www.exampledepot.com/egs/java.awt.image/ImagePixel.html

Object Outline code
http://stackoverflow.com/questions/10834109/java-create-a-shape-from-border-around-image

Larsen lectures
http://www.matdat.life.ku.dk/ia/sessions/ 

Greyscale code

// grayscale transparent image:

ColorSpace gsColorSpace = ColorSpace.getInstance(ColorSpace.CS_GRAY);

ComponentColorModel ccm = new ComponentColorModel(gsColorSpace, true, false, Transparency.TRANSLUCENT, DataBuffer.TYPE_BYTE);

WritableRaster raster = ccm.createCompatibleWritableRaster(width, height);

Image result = new BufferedImage(ccm, raster, ccm.isAlphaPremultiplied(), null);



///
public static BufferedImage convertToGray(BufferedImage image) {
    BufferedImage gray = new BufferedImage(image.getWidth(),
            image.getHeight(), BufferedImage.TYPE_BYTE_GRAY);
    ColorConvertOp op = new ColorConvertOp(
            image.getColorModel().getColorSpace(),
            gray.getColorModel().getColorSpace(), null);
    op.filter(image, gray);
    return gray;
}

imagej and Springsource

No springsource installation instructions at http://developer.imagej.net/development.

Need to use a SCM git connector to import the ImageJ github files but STS (Eclipse) is broken.


http://developer.imagej.net/eclipse

 git clone git://github.com/imagej/imagej.git

Alternately, if you already checked out the source code outside Eclipse, you can use the File > Import Existing Maven Projects command to import the projects instead.

Tuesday, June 12, 2012

Erlang modules that provide a TAP testing client library.

Etap-program testing lib

etap is a collection of Erlang modules that provide a TAP testing client library. These modules allow developers to create extensive and comprehensive tests covering many aspects of application and module development. This includes simple assertions, exceptions, the application behavior and even web requests. This library was originally written by Jeremy Wall.

Sunday, June 10, 2012

Erlang: tv:start(). - no window pops up - solution

Original page...
 
> I tried tv:start/0 and toolbar:start/0: nothing happend:
>  1) tv:start just returned a PID... nothing more
>  2) toolbar:start was silent for some time and than returned:
>        ** exception exit: {startup_timeout,toolbar}
>             in function  toolbar:init_ok/1

It turns out (tcl/)tk wasn't installed on those systems in question.
After "sudo apt-get -y install tk-dev" tv and toolbar are working! - boris

Wednesday, June 6, 2012

Erlang and mnesia select

This is copied from Erlang and mnesia:select

 

Here is an example from the official Erlang docs. The query returns the name of each male person aged more than 30.
MatchHead = #person{name='$1', sex=male, age='$2', _='_'},
Guard = {'>', '$2', 30},
Result = '$1',
mnesia:select(Tab,[{MatchHead, [Guard], [Result]}]),
Criteria are expressed with $ variables, and the whole thing becomes quite convoluted for anything more complicated.
Furthermore, it is impossible to do what would be basic operations in other database engines, like sorting the results.
But a module exists that makes queries more legible: QLC, which stands for Query List Comprehension. It supports Mnesia, ETS and DETS.
Here is the previous query, rewritten:
Query = qlc:q([Person#person.name || Person <- mnesia:table(Tab), Person#person.sex == male, Person#person.age > 30]),
In this case the query is expressed as a list comprehension. Criteria are written in a comprehensible manner in the second part of the list comprehension.
If you want to execute this query in Mnesia, you have to do so in a transaction.
-include_lib("stdlib/include/qlc.hrl").
Transaction = fun() ->
    Query = qlc:q([Person#person.name || Person <- mnesia:table(Tab), Person#person.sex == male, Person#person.age > 30]),
    qlc:eval(Query)
end,
mnesia:transaction(Transaction),
To efficiently sort the result, qlc provides qlc:sort. The query below selects whole records rather than names, so that the ordering fun can compare ages.
-include_lib("stdlib/include/qlc.hrl").
Transaction = fun() ->
    Query = qlc:q([Person || Person <- mnesia:table(Tab), Person#person.sex == male, Person#person.age > 30]),
    Order = fun(A, B) ->
        B#person.age > A#person.age
    end,
    qlc:eval(qlc:sort(Query, [{order, Order}]))
end,
mnesia:transaction(Transaction),

Erlang and the query interface for tables

See QLC query interface for more details

MODULE

qlc

MODULE SUMMARY

Query Interface to Mnesia, ETS, Dets, etc

DESCRIPTION


The qlc module provides a query interface to Mnesia, ETS, Dets and other data structures that implement an iterator style traversal of objects.

Overview

The qlc module implements a query interface to QLC tables. Typical QLC tables are ETS, Dets, and Mnesia tables. There is also support for user defined tables, see the Implementing a QLC table section. A query is stated using Query List Comprehensions (QLCs). The answers to a query are determined by data in QLC tables that fulfill the constraints expressed by the QLCs of the query. QLCs are similar to ordinary list comprehensions as described in the Erlang Reference Manual and Programming Examples except that variables introduced in patterns cannot be used in list expressions. In fact, in the absence of optimizations and options such as cache and unique (see below), every QLC free of QLC tables evaluates to the same list of answers as the identical ordinary list comprehension.
While ordinary list comprehensions evaluate to lists, calling qlc:q/1,2 returns a Query Handle. To obtain all the answers to a query, qlc:eval/1,2 should be called with the query handle as first argument. Query handles are essentially functional objects ("funs") created in the module calling q/1,2. As the funs refer to the module's code, one should be careful not to keep query handles too long if the module's code is to be replaced. Code replacement is described in the Erlang Reference Manual. The list of answers can also be traversed in chunks by use of a Query Cursor. Query cursors are created by calling qlc:cursor/1,2 with a query handle as first argument. Query cursors are essentially Erlang processes. One answer at a time is sent from the query cursor process to the process that created the cursor.

Syntax

How to convert code to HTML for your blog

SimpleCode has a code to HTML converter.


Enter normal (X)HTML in their markup box. Press "Process" and it will spit out entity-encoded markup suitable for examples. Use spaces in increments of two for nesting indents.
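
The entity-encoding step can also be sketched locally with Python's standard html module. This is not how SimpleCode itself is implemented, only an illustration of the same transformation; the source snippet is made up.

```python
import html

# Hypothetical snippet to publish.
source = 'if (a < b && name == "x") { print("<b>bold</b>"); }'

# html.escape replaces &, < and > (and quotes by default) with entities.
encoded = html.escape(source)

# Wrapping in <pre> keeps the indentation when the post is rendered.
block = "<pre>" + encoded + "</pre>"
print(block)
```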


Erlang and mnesia example

There is an mnesia example at the following url:

http://www.erlang.org/doc/apps/mnesia/Mnesia_chap2.html#id64245

However, the example code is incomplete.

So I copied some of the functions from the Erlang example and added some new functions to use on the DB.

Some of the functions from the page below do not seem to work (all_*).

%-----------------------------------------------------------------
%company.erl
% compile in the shell with
% > c(company).
%-----------------------------------------------------------------

-module(company). 
-export([
init/0,
insert_emp/3,
mk_projs/2,
mk_project/2,
raise/2,
all_females/0,
all_males/0,
raise_females/1,
over_write/2,
read_employee/1,
read_employee_table/2,
read_emp1/2
]).

-include_lib("stdlib/include/qlc.hrl").
-include("company.hrl").

%%----------------------------------------

init()                             ->
    mnesia:create_table(employee,
                        [{attributes, record_info(fields, employee)}]),
    mnesia:create_table(dept,
                        [{attributes, record_info(fields, dept)}]),
    mnesia:create_table(project,
                        [{attributes, record_info(fields, project)}]),
    mnesia:create_table(manager, [{type, bag},
                                  {attributes, record_info(fields, manager)}]),
    mnesia:create_table(at_dep,
                        [{attributes, record_info(fields, at_dep)}]),
    mnesia:create_table(in_proj, [{type, bag},
                                  {attributes, record_info(fields, in_proj)}]).

%%----------------------------------------

insert_emp(Emp, DeptId, ProjNames) ->
    Ename = Emp#employee.name,
    Fun = fun() ->
                  mnesia:write(Emp),
                  AtDep = #at_dep{emp = Ename, dept_id = DeptId},
                  mnesia:write(AtDep),
                  mk_projs(Ename, ProjNames)
          end,
    mnesia:transaction(Fun).

%%----------------------------------------

mk_projs(Ename, [ProjName|Tail])   ->
    mnesia:write(#in_proj{emp = Ename, proj_name = ProjName}),
    mk_projs(Ename, Tail);
mk_projs(_, [])                    -> ok.

%%----------------------------------------

mk_project(ProjName, ProjNumber)   ->
    mnesia:write(#project{name = ProjName, number = ProjNumber}).

%%----------------------------------------

raise(Eno, Raise)                  ->
    F = fun() ->
                [E] = mnesia:read(employee, Eno, write),
                Salary = E#employee.salary + Raise,
                New = E#employee{salary = Salary},
                mnesia:write(New)
        end,
    mnesia:transaction(F).

%%----------------------------------------

all_males()                        ->
    F = fun() ->
                Male = #employee{sex = male, name = '$1', _ = '_'},
                mnesia:select(employee, [{Male, [], ['$1']}])
        end,
    mnesia:transaction(F).

%%----------------------------------------

all_females()                      ->
    F = fun() ->
                Female = #employee{sex = female, name = '$1', _ = '_'},
                mnesia:select(employee, [{Female, [], ['$1']}])
        end,
    mnesia:transaction(F).

%%----------------------------------------

raise_females(Amount)              ->
    F = fun() ->
                Q = qlc:q([E || E <- mnesia:table(employee),
                                E#employee.sex == female]),
                Fs = qlc:e(Q),
                over_write(Fs, Amount)
        end,
    mnesia:transaction(F).

%%----------------------------------------

over_write([E|Tail], Amount)       ->
    Salary = E#employee.salary + Amount,
    New = E#employee{salary = Salary},
    mnesia:write(New),
    1 + over_write(Tail, Amount);
over_write([], _)                  ->   0.

%%----------------------------------------

read_employee(Eno)                 ->
    Fun = fun() ->
                  mnesia:read({employee, Eno})
          end,
    {atomic, [Row]} = mnesia:transaction(Fun),
    io:format("~p~n", [Row#employee.name]).

%%----------------------------------------

read_employee_table(Table, Eno)    ->
    mnesia:read({Table, Eno}).

%read_employee_name(Eno, Table, Field) ->
%  {atomic,[Row]}=mnesia:transaction(read_employee_table(Table, Eno)),
%  io:format("~p~n",[RowTable.Field]).
%

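read_emp1(Eno, Field)              ->
    Fun = fun() ->
                  mnesia:read({employee, Eno})
          end,
    {atomic, [Row]} = mnesia:transaction(Fun),
    %% Row#employee.Field does not compile: record field names must be
    %% atoms known at compile time. Pair each field name with its value
    %% and look Field up at run time instead.
    Pairs = lists:zip(record_info(fields, employee), tl(tuple_to_list(Row))),
    io:format("~p~n", [proplists:get_value(Field, Pairs)]).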



%  io:format("~p~n",[Row#employee.emp_no]).
% mnesia:read(employee, Eno, write).

%%----------------------------------------

%Fun = fun() ->
%   mnesia:read({employee,115018})
%   end,
% {atomic,[Row]}=mnesia:transaction(Fun),
% io:format("~p~n",[Row#employee.emp_no]).




%-----------------------------------------------------------------
%company.hrl
% compile in the shell with
% > rr("company").
%-----------------------------------------------------------------

-record(employee, {emp_no,
                   name,
                   salary,
                   sex,
                   phone,
                   room_no}).

-record(dept, {id,
               name}).

-record(project, {name,
                  number}).


-record(manager, {emp,
                  dept}).

-record(at_dep, {emp,
                 dept_id}).

-record(in_proj, {emp,
                  proj_name}).

 

%-----------------------------------------------------------------
% company_sample.erl
% compile in the shell with
% > c(company_sample).
%-----------------------------------------------------------------

 erl -mnesia dir '"/tmp/Mnesia.Company"'

mnesia:create_schema([node()]).
mnesia:start().
c(company).
rr("company.hrl").
company:init().
mnesia:info().

%Employees
Emp0  = #employee{emp_no = 104465, name = "Johnson Torbjorn",  salary =  1, sex = male,   phone = 99184, room_no = {242,038}}, company:insert_emp(Emp0, 'B/SFR', [otp]).
Emp1  = #employee{emp_no = 107912, name = "Carlsson Tuula",    salary =  2, sex = female, phone = 94556, room_no = {242,056}}, company:insert_emp(Emp1, 'B/SFR', [otp]).
Emp2  = #employee{emp_no = 114872, name = "Dacker Bjarne",     salary =  3, sex = male,   phone = 99415, room_no = {221,035}}, company:insert_emp(Emp2, 'B/SFR', [otp]).
Emp3  = #employee{emp_no = 104531, name = "Nilsson Hans",      salary =  3, sex = male,   phone = 99495, room_no = {222,026}}, company:insert_emp(Emp3, 'B/SFR', [otp]).
Emp4  = #employee{emp_no = 104659, name = "Tornkvist Torbjorn",salary =  2, sex = male,   phone = 99514, room_no = {222,022}}, company:insert_emp(Emp4, 'B/SFR', [otp]).
Emp5  = #employee{emp_no = 104732, name = "Wikstrom Claes",    salary =  2, sex = male,   phone = 99586, room_no = {221,015}}, company:insert_emp(Emp5, 'B/SFR', [otp]).
Emp6  = #employee{emp_no = 117716, name = "Fedoriw Anna",      salary =  1, sex = female, phone = 99143, room_no = {221,031}}, company:insert_emp(Emp6, 'B/SFR', [otp]).
Emp7  = #employee{emp_no = 115018, name = "Mattsson Hakan",    salary =  3, sex = male,   phone = 99251, room_no = {203,348}}, company:insert_emp(Emp7, 'B/SFR', [otp]).    

%company:all_females();
%company:all_males();

% %Dept
%
%         {dept, 'B/SF',  "Open Telecom Platform"}.
%         {dept, 'B/SFP', "OTP - Product Development"}.
%         {dept, 'B/SFR', "Computer Science Laboratory"}.
%     
%

% %Projects
%company:mk_project(erlang       , 1).
%company:mk_project(otp          , 2).
%company:mk_project(beam         , 3).
%company:mk_project(mnesia       , 5).
%company:mk_project(wolf         , 6).
%company:mk_project(documentation, 7).
%company:mk_project(www          , 8).

%     
%
% % The above three tables, titled employees, dept, and projects, are the tables which are made up of real records.
% % The following database content is stored in the tables which is built on relationships.
% % These tables are titled manager, at_dep, and in_proj.
%
% %Manager
%
%         {manager, 104465, 'B/SF'}.
%         {manager, 104465, 'B/SFP'}.
%         {manager, 114872, 'B/SFR'}.
%     
%
% %At_dep
%
%         {at_dep, 104465, 'B/SF'}.
%         {at_dep, 107912, 'B/SF'}.
%         {at_dep, 114872, 'B/SFR'}.
%         {at_dep, 104531, 'B/SFR'}.
%         {at_dep, 104659, 'B/SFR'}.
%         {at_dep, 104732, 'B/SFR'}.
%         {at_dep, 117716, 'B/SFP'}.
%         {at_dep, 115018, 'B/SFP'}.
%     
%
% %In_proj
%
%         {in_proj, 104465, otp}.
%         {in_proj, 107912, otp}.
%         {in_proj, 114872, otp}.
%         {in_proj, 104531, otp}.
%         {in_proj, 104531, mnesia}.
%         {in_proj, 104545, wolf}.
%         {in_proj, 104659, otp}.
%         {in_proj, 104659, wolf}.
%         {in_proj, 104732, otp}.
%         {in_proj, 104732, mnesia}.
%         {in_proj, 104732, erlang}.
%         {in_proj, 117716, otp}.
%         {in_proj, 117716, documentation}.
%         {in_proj, 115018, otp}.
%         {in_proj, 115018, mnesia}.
%     
% otp = "otp".
% Emp  = #employee{emp_no= 104732,                     
%                             name = klacke,
%                             salary = 7,
%                             sex = male,
%                             phone = 98108,
%                             room_no = {221, 015}},
%          insert_emp(Me, 'B/SFR', [otp]).




Erlang How tos...

http://www.trapexit.org/Category:HowTo

Tuesday, June 5, 2012

Erlang and the mnesia DB

Shows how to use mnesia with Erlang records.

http://ciarang.com/posts/getting-started-with-mnesia

Up and running with Emacs, Erlang, and Distel

How to set up emacs erlang-mode


http://parijatmishra.wordpress.com/2008/08/15/up-and-running-with-emacs-erlang-and-distel/



Monday, June 4, 2012

Erlang recipes

Learning a bit about Erlang now. Found this interesting page with recipes for common tasks.

http://www.trapexit.org/Category:CookBook

Friday, February 24, 2012

install scala distribution

http://www.scala-lang.org/
Download Scala SDK archive for your specific platform, Scala API documentation and sample, unzip archives to the locations of your choice, and configure environment variables

Install the Scala distribution:
http://www.scala-lang.org/downloads

Unix Installation
-----------------

Untar the archive. All Scala tools are located in the "bin" directory.
Adding that directory to the PATH variable will make the Scala commands
directly accessible.

You may test the distribution by running the following commands:

$ ./bin/sbaz install scala-devel-docs
$ ./bin/scalac doc/scala-devel-docs/examples/sort.scala
$ ./bin/scala examples.sort
[6,2,8,5,1]
[1,2,5,6,8]
$ ./bin/scala
scala> examples.sort.main(null)
[6,2,8,5,1]
[1,2,5,6,8]
scala>:quit
$

IntelliJ scala setup

http://confluence.jetbrains.net/display/SCA/Getting+Started+with+IntelliJ+IDEA+Scala+Plugin

Scala emacs/ensime setup

http://jawher.net/2011/01/17/scala-development-environment-emacs-sbt-ensime/

Friday, January 21, 2011

Fix UTF8 character encoding problem in Eclipse

Cross-platform character encoding

Eclipse uses the local character set of the operating system when editing source files. By default, Eclipse on Mac OS X uses the MacRoman character set. If you're working with Eclipse on a cross-platform project, it's better to use UTF-8 character encoding. To specify this encoding for all files in a project, right-click the project in the Package Explorer view, then click Properties. Beneath Text file encoding, select Other, then select UTF-8 from the list, as shown below.

1. Right-click the project and select Properties.
2. Click Resource.
3. Under Text file encoding, select Other.
4. Select UTF-8 from the pull-down menu.
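
Why the setting matters can be sketched in a few lines of Python 3: the same bytes read with the wrong character set come out garbled. The sample word is an assumption; mac_roman is Python's codec name for the MacRoman character set.

```python
text = "café"

# Bytes as a UTF-8 editor would save them.
utf8_bytes = text.encode("utf-8")

# Reading those bytes with the wrong charset garbles non-ASCII characters.
misread = utf8_bytes.decode("mac_roman")

# Reading them with the charset they were written in restores the text.
roundtrip = utf8_bytes.decode("utf-8")

print(repr(misread))
print(roundtrip == text)
```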

Setting up subversion (svn) on a Mac OSX 10.6 system for Eclipse

Thursday, July 22, 2010

Hadoop cluster setup

Hadoop setup
Important Directories
One of the basic tasks involved in setting up a Hadoop cluster is determining where the various Hadoop-related directories will be located. Where they go is up to you; in some cases, the default locations are inadvisable and should be changed. This section identifies these directories.
Directory         | Description                                       | Default location                | Suggested location
------------------|---------------------------------------------------|---------------------------------|----------------------
HADOOP_LOG_DIR    | Output location for log files from daemons        | ${HADOOP_HOME}/logs             | /var/log/hadoop
hadoop.tmp.dir    | A base for other temporary directories            | /tmp/hadoop-${user.name}        | /tmp/hadoop
dfs.name.dir      | Where the NameNode metadata should be stored      | ${hadoop.tmp.dir}/dfs/name      | /home/hadoop/dfs/name
dfs.data.dir      | Where DataNodes store their blocks                | ${hadoop.tmp.dir}/dfs/data      | /home/hadoop/dfs/data
mapred.system.dir | The in-HDFS path to shared MapReduce system files | ${hadoop.tmp.dir}/mapred/system | /hadoop/mapred/system
This table is not exhaustive; several other directories are listed in conf/hadoop-default.xml. The remaining directories, however, are initialized by default to reside under hadoop.tmp.dir, and are unlikely to be a concern.
It is critically important in a real cluster that dfs.name.dir and dfs.data.dir be moved out from hadoop.tmp.dir. A real cluster should never consider these directories temporary, as they are where all persistent HDFS data resides. Production clusters should have two paths listed for dfs.name.dir which are on two different physical file systems, to ensure that cluster metadata is preserved in the event of hardware failure.
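The suggested locations above can be turned into site overrides. The following is only a minimal sketch: it assumes an older Hadoop release where overrides live in conf/hadoop-site.xml, and the helper names (build_site_config, persistent_dirs_outside_tmp) are made up for illustration. HADOOP_LOG_DIR is left out because it is an environment variable rather than a configuration property.

```python
import xml.etree.ElementTree as ET

# Suggested locations copied from the directory list above.
SUGGESTED = {
    "hadoop.tmp.dir": "/tmp/hadoop",
    "dfs.name.dir": "/home/hadoop/dfs/name",
    "dfs.data.dir": "/home/hadoop/dfs/data",
    "mapred.system.dir": "/hadoop/mapred/system",
}


def build_site_config(props):
    """Build a <configuration> element with one <property> per entry."""
    root = ET.Element("configuration")
    for name in sorted(props):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = props[name]
    return root


def persistent_dirs_outside_tmp(props):
    """Return True if dfs.name.dir and dfs.data.dir are not under hadoop.tmp.dir."""
    tmp = props["hadoop.tmp.dir"].rstrip("/") + "/"
    persistent = ("dfs.name.dir", "dfs.data.dir")
    return all(not (props[key].rstrip("/") + "/").startswith(tmp)
               for key in persistent)


# Serialize the overrides; the result could be pasted into the site file.
site_xml = ET.tostring(build_site_config(SUGGESTED)).decode("utf-8")
print(site_xml)
print("persistent dirs outside tmp: %s" % persistent_dirs_outside_tmp(SUGGESTED))
```

With the default locations (everything under hadoop.tmp.dir) the check returns False, which is the situation the paragraph above warns against.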

Monday, June 28, 2010

Upgrading from Mac Leopard to Snow Leopard-clean install-external HDD

1. Buy a WD Scorpio 320GB and put it in an external enclosure
2. Format the drive using a GUID partition table
-Make a new partition on the external drive to hold the OS/applications -Mac \hdd
-Make additional partitions to hold videos...
3. Insert the Snow Leopard DVD and reboot
4. Install Snow Leopard on the external hdd Mac \hdd partition
5. Reboot
6. Update the Mac software Menu->Apple->Software Update
7. Now copy your old login.keychain to the Mac \hdd/Volumes/Users//Library/Keychains
8. Use Keychain Access to create a new keychain file. Then quit Keychain Access. In a shell, copy the old keychain file over the newly created one.
9. Enable root access: http://support.apple.com/kb/ht1528
10. Follow the instructions on this page except ignore the Keychain restoration procedure. Apple personal information transfer instructions

Friday, June 4, 2010

Selecting the right HDD for large data applications

Selecting the right HDD is about more than just getting a good deal at Fry's. Not all HDDs are created equal.
Caviar black discussion including motor load and spindles


Drive specs including platter sizes

Friday, April 23, 2010

finding the match boundaries in a Perl regex

Perl FAQ
"Since Perl 5.6.1 the special variables @- and @+ can functionally replace $`, $& and $'. These arrays contain pointers to the beginning and end of each match (see perlvar for the full story), so they give you essentially the same information, but without the risk of excessive string copying."


Regex-Related Special Variables

Perl has a host of special variables that get filled after every m// or s/// regex match. $1, $2, $3, etc. hold the backreferences. $+ holds the last (highest-numbered) backreference. $& (dollar ampersand) holds the entire regex match.

@- is an array of match-start indices into the string. $-[0] holds the start of the entire regex match, $-[1] the start of the first backreference, etc. Likewise, @+ holds match-end indices (ends, not lengths).

$' (dollar followed by an apostrophe or single quote) holds the part of the string after (to the right of) the regex match. $` (dollar backtick) holds the part of the string before (to the left of) the regex match. Using these variables is not recommended in scripts when performance matters, as it causes Perl to slow down all regex matches in your entire script.

All these variables are read-only, and persist until the next regex match is attempted. They are dynamically scoped, as if they had an implicit 'local' at the start of the enclosing scope. Thus if you do a regex match, and call a sub that does a regex match, when that sub returns, your variables are still set as they were for the first match.


if ($lineCopy =~ /$joinedColumns/g) {
    my $start = $-[0]; # start index of the whole match is element 0 of @-
    my $end   = $+[0]; # end index of the whole match is element 0 of @+

    print "MATCH: Found '$&'. lineCopy= " . $lineCopy . "\n";
    print "MATCH: atminux = @- atplus= @+\n";
    # print "MATCH: Next attempt at character " . (pos($lineCopy) + 1) . "\n";
}
else {
    print "NO MATCH: line = $lineCopy joinedColumns = $joinedColumns\n";
}

MATCH: Found 'attachments,grinder attachments'. lineCopy= tools,attachments,grinder attachments
MATCH: atminux = 6 atplus= 37
NO MATCH: line = tools,attachments,hammer \& hammer drill attachments joinedColumns = attachments,hammer\ \&\ hammer\ drill\ attachments
MATCH: Found 'attachments,jig saw attachments'. lineCopy= tools,attachments,jig saw attachments
MATCH: atminux = 6 atplus= 37
MATCH: Found 'attachments,metal case'. lineCopy= tools,attachments,metal case
MATCH: atminux = 6 atplus= 28
MATCH: Found 'attachments,miter saw attachments'. lineCopy= tools,attachments,miter saw attachments
MATCH: atminux = 6 atplus= 39
MATCH: Found 'attachments,nibbler attachments'. lineCopy= tools,attachments,nibbler attachments
MATCH: atminux = 6 atplus= 37
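
For comparison with the Python pages on this blog, here is a rough Python analogue of the same boundary lookup. It is a sketch rather than a port of the Perl script: the sample line is taken from the output above, and re.escape is assumed to stand in for whatever quoting produced $joinedColumns. The match object's start() and end() play the roles of $-[0] and $+[0], and slicing the string replaces $`, $& and $'.

```python
import re

line = "tools,attachments,grinder attachments"
# Escape the literal text so punctuation is not treated as regex syntax.
pattern = re.escape("attachments,grinder attachments")

m = re.search(pattern, line)
if m:
    start, end = m.start(), m.end()   # like $-[0] and $+[0]
    before = line[:start]             # like $` (text before the match)
    matched = line[start:end]         # like $& (the match itself)
    after = line[end:]                # like $' (text after the match)
    print("MATCH: Found %r. start=%d end=%d" % (matched, start, end))
else:
    print("NO MATCH: line = %s" % line)
```

For this line the start and end indices are 6 and 37, the same values the Perl run printed for @- and @+.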

Friday, April 16, 2010

TRAC installation including trac HTML form based authentication

trac-admin /home/trac/yo_web_services initenv
chown -R apache.apache /home/svn/yo_web_services
chown -R apache.apache /home/trac/yo_web_services

vim /etc/httpd/conf.d/trac.conf
>>

SetHandler mod_python
PythonHandler trac.web.modpython_frontend
PythonOption TracEnv /home/trac/yo_web_services
PythonOption TracUriRoot /trac/yo_web_services



AuthType Basic
AuthName "trac"
AuthUserFile /home/trac/trac.htpasswd
# comment the next line if using HTML form based login using the trac plugins
# per the trac-hacks page
# Require valid-user


<<
touch /home/trac/yo_web_services.htpasswd
# Add users to the password file
htpasswd -m /home/trac/yo_web_services.htpasswd <username>
trac-admin /home/trac/yo_web_services permission add <username> TRAC_ADMIN

service httpd restart

Add the plugins from this page
http://trac-hacks.org/wiki/AccountManagerPlugin

Thursday, April 15, 2010

Mac OSX X11 fix

make sure /usr/X11R6 is empty
cd /usr
ln -s X11R6 X11
now the dylibs will be found...

Monday, March 8, 2010

Fedora 12 Cloudera Hadoop setup + Java JDK

Cloudera's Hadoop distribution

When installing Cloudera's Hadoop distribution on Fedora 12 make sure you install
the Sun Java SDK using the method recommended below.




Sun Java


Fedora Java installation
# yum install hadoop
Loaded plugins: presto, refresh-packagekit
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package hadoop.noarch 0:0.18.3-14.cloudera.CH0_3 set to be updated
--> Processing Dependency: jdk >= 1.6 for package: hadoop-0.18.3-14.cloudera.CH0_3.noarch
--> Finished Dependency Resolution
hadoop-0.18.3-14.cloudera.CH0_3.noarch from cloudera-stable has depsolving problems
--> Missing Dependency: jdk >= 1.6 is needed by package hadoop-0.18.3-14.cloudera.CH0_3.noarch (cloudera-stable)
Error: Missing Dependency: jdk >= 1.6 is needed by package hadoop-0.18.3-14.cloudera.CH0_3.noarch (cloudera-stable)
You could try using --skip-broken to work around the problem
You could try running: package-cleanup --problems
package-cleanup --dupes
rpm -Va --nofiles --nodigest

Cloudera RPM Java installation to avoid the yum install dep problem

Thursday, January 21, 2010

how to speed up Heritrix

I figured out why the Heritrix crawler was running at one page per second.
It was configured to run with the default Java VM heap size of 256m.

cat /etc/init.d/heritrix.sh
#!/bin/bash

/opt/heritrix/bin/heritrix --bind=yowb3 --admin=admin:admin

I changed this to 2048m and it seems to be running 10x faster

cat /etc/init.d/heritrix.sh
#!/bin/bash
export JAVA_OPTS=" -Xmx2048m"

/opt/heritrix/bin/heritrix --bind=yowb3 --admin=admin:admin

-----------------

Rates
9.55 URIs/sec (16.1 avg)
246 KB/sec (389 avg)

Load
6 active of 50 threads
1 congestion ratio

Thursday, January 7, 2010

Lucene index writes per minute slow down

28 million data records were indexed.
The write rate for the index was as follows:
Time Writes per minute
0-2 minutes 100,000
...
8 hours later 5,000

See graph (shows writes per second).

Friday, October 30, 2009

Boost smart_pointers need a new object not a pointer

typedef std::string TString;
typedef boost::shared_ptr<TString> RefString;

// this does not work->nasty memory leak
RefString rsUnitName = RefString(new std::string((*i)->pType->unitType));
RefString rsInstanceName = RefString(new std::string((*i)->unitObjectName));

refMapUnitNameUnitInstName.get()->operator [](rsUnitName) = rsInstanceName;

refMapUnitNameUnitInstName->insert(std::make_pair(rsUnitName, rsInstanceName));

refMapUnitNameUnitInstName->insert(std::pair<RefString, RefString>(rsUnitName, rsInstanceName));

// this also causes a memory leak.
// pair can't figure out the size of the objects pointed to by the RefString for some reason
// and assigns a default of 8-bits

refMapUnitNameUnitInstName->insert(std::pair<RefString, RefString>(RefString(new std::string((*i)->pType->unitType)),
RefString(new std::string((*i)->unitObjectName)))
);


// THIS WORKS!
// New the objects inside of the smart pointer wrapper and pass the types to std::pair
refMapUnitNameUnitInstName->insert(std::pair<RefString, RefString>(RefString(new std::string((*i)->pType->unitType)),
RefString(new std::string((*i)->unitObjectName)))
);