Beautiful Models (and Data)

Work-in-progress (updated Nov. 2015)

Array- or Set-based Coding Style

An Anti-Pattern?

Note: This is not Yet-Another-Article warning noobs about RAM usage when appending strings

This is an exploration of advantages gained when you code everything as an array. (Using Jedi concepts from SmallTalk)

Here are some guiding principles:

  1. All input is array-like, even if it is an array of 1.
  2. Functions should generally accept AND return arrays.
  3. The code of 99 out of 100 devs suffers from models built on what I call Acute Schema Surplusage syndrome.
  4. Yes, beware fat ASS models, with all the predictable trappings: fragile instance state (so many levers and knobs to mess with), DB transactions, SQL locks, async/mutexing (that always works first time), idiomatic property getters/setters, and your public/private/final/etc. usage is solid, right?

  5. So let me take a common problem and shoehorn in some of my set-based heretical musings.

  6. Why is a Product price always a single data point? Why on earth would I make price(s) an Array?
  7. Let’s add this functionality:
    1. New requirements: retailPrice, priceSavings
  8. These changes hopefully look no worse than my sorry attempt:
package net.danlevy.testpool.why.java.has.so.many.dots;

public class Product {

  public String name;
  public float price;
  public float retailPrice;
  public float priceSavings;

  public Product(String name, float price) {
    this.name = name;
    this.price = price;
    this.retailPrice = price;
    this.priceSavings = 0.0f;
  }

  public Product(String name, float price,
      float retailPrice,
      float priceSavings) {
    this.name = name;
    this.price = price;
    this.retailPrice = retailPrice;
    this.priceSavings = priceSavings;
  }

  public float getPriceSavings() {
    return this.retailPrice - this.price;
  }

}

(I’m not replacing the valid pattern of tracking historical prices in tables.)

  1. I’m not sure about you, but the price of just about anything is in flux, just given time.
  2. I experience price as a constantly fluctuating data point.

  1. For example, even values which seem like singular variables - say a Product class includes var listPrice = 125 - change it to var prices = [50, 100, 125].
  2. Bear with me. That is not likely the final re-factor on that…
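function Product({name = 'widget', prices = []} = {}) {
  // The `= {}` default keeps a bare `new Product()` from throwing
  this.name = name;
  this.prices = prices;
}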
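
To make principles 1 and 2 a bit more concrete, here is a minimal sketch of the prices-as-array idea. Everything in it is hypothetical: toArray, makeProduct and priceSummaries are names I made up for illustration, and treating the highest known price as retailPrice and the lowest as the current price is an assumption, not a rule.

```javascript
// Hypothetical helper: wrap any single value in an array (principle 1).
function toArray(input) {
  if (input === undefined || input === null) return [];
  return Array.isArray(input) ? input : [input];
}

// Hypothetical Product factory: `prices` is always an array, even for one price.
function makeProduct({name = 'widget', prices = []} = {}) {
  return {name, prices: toArray(prices)};
}

// Accepts one product or an array of products and returns an array (principle 2).
// Assumption: the highest known price is the retail price,
// and the lowest known price is the current price.
function priceSummaries(products) {
  return toArray(products).map(({name, prices}) => {
    if (prices.length === 0) {
      return {name, price: null, retailPrice: null, priceSavings: null};
    }
    const price = Math.min(...prices);
    const retailPrice = Math.max(...prices);
    return {name, price, retailPrice, priceSavings: retailPrice - price};
  });
}
```

With this shape, priceSummaries(makeProduct({prices: [50, 100, 125]})) returns a one-element array whose entry has price 50, retailPrice 125 and priceSavings 75. Note that retailPrice and priceSavings are derived from the array instead of being stored as extra fields.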

Foreshadowing: We’re going to go through a concept familiar to LISP, SmallTalk, et al. devs. It’s known by many names, however I prefer Array-based programming.

The issue we’ll examine is deceptively simple & subtle: Naming

I want to avoid the super-fancy-tech-lingo for this article; and hopefully I can illustrate the issue in a more useful fashion.

While this has been covered in exhaustive detail before, the subject matter often gets too technical for the novice programmer to draw any practical understanding. You probably don’t need to read this if the following already make sense: NoSQL denormalization strategy, or Boyce-Codd Normal Form.

Recommended reading includes:

  1. Book: Code Complete
  2. http://phlonx.com/resources/nf3/
  3. https://en.wikipedia.org/wiki/Database_normalization

The Problem - by Example

Have you ever designed a data model (in code, SQL, or Excel worksheets)? Does the following look familiar?

*** anti-pattern - don't copy-paste ***
* User
  - id
  - avatarUrl
  - email
  - passwordHash

* Agent
  - id
  - primaryPhoto
  - agentName
  - agentEmail
  - agentPhoneMain
  - agentEmailPrimary
  - agentPhonePrimary
  - agentAddressLine1
  - agentCompanyName
  - agentCompanyAddress
  - *userEmail* - 'Pointer' to User table ^^^

If this is familiar to you, I’ll bet you:

  1. Feel any change to your app will necessitate hours of arduous debugging.
  2. Fear ANY schema refactor driven by changing requirements.

The Cost of Bad (Naming) Habits

Let’s examine some of the subtle issues (probably familiar):


Why is naming a field agentEmailPrimary the worst?

For starters, you are not creating an entirely new object unto the universe. Over-specificity has some traps:

  1. Being ‘locked’ into a highly specific name means agentEmailPrimary probably makes your views and related code 0% reusable, and brings annoyingly recurring bugs like:
    • Data not syncing between tables (it’s not obvious whether user.email needs to propagate to agent.agentEmail or vice-versa, never mind the complexity of manually implementing where & how to enforce this ‘logic’ …)
    • Validation rules/logic are likely duplicated & inconsistent.
    • Increasingly, your project will resemble a shaky Jenga tower.
    • Fragility piles up with every single new file, as an extremely high attention to detail is required for even trivial changes.
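
The duplicated-validation trap from the list above can be sketched roughly like this. The function names and the email check are hypothetical and deliberately simplified (this is not a complete email validator); only the field names come from the schema above.

```javascript
// Simplified check for illustration only; not a complete email validator.
const looksLikeEmail = value =>
  typeof value === 'string' && /^[^@\s]+@[^@\s]+$/.test(value);

// With one generic field name, a single rule covers every record type.
function invalidEmails(records) {
  return records.filter(r => !looksLikeEmail(r.email));
}

// With over-specific names, every table needs its own field list,
// and each list is one more place to forget an update.
function invalidAgentEmails(agents) {
  const fields = ['agentEmail', 'agentEmailPrimary', 'userEmail'];
  return agents.filter(a => fields.some(f => !looksLikeEmail(a[f])));
}
```

The first function works for users, agents, leads or anything else that has an email field. The second one only ever works for the Agent table, and it has to be touched every time someone adds yet another agentEmailSomething column.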

I know, you probably feel something like…

fuck this

A Solution

// Dan's Recommended Schema Consolidation:

User
  - id
  - role: ['agent', 'lead', 'admin']
  - name
  - phone
  - address
  - email
  - passwordHash
  - company
    - name
    - address

I removed the Agent table, as it didn’t contain fields which were uniquely related to Agents.

All changes were made with these general ideas in mind:

  1. Eliminate unnecessary tables. If you have a few dozen tables, this step is mandatory.
  2. Try to merge related tables. This is important if you are coming from a SQL background to NoSQL.
  3. Delete redundant data collection (e.g. remove the ActivityLogs table if it has been replaced by Google Analytics).
  4. Try keeping all field names to a single word/noun/pronoun.
  5. There is no such thing as Agent.agentEmail or Agent.agentPhonePrimary. Period.
  6. By using Highly Specific Names, you cast-in-stone a specific level of code-reusability and durability, well, specifically ZERO %.
  7. Don’t think you are doing yourself any favors with crap like User.profileSummaryEmail (where ‘profile’ could include contact details for a personal ads site). This is probably a good point to create a new table, say Profiles, which includes Profiles.email.
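
For what it’s worth, here is a rough sketch of how the old User + Agent rows could be folded into the consolidated User shape. The helper names are hypothetical, and the precedence rules (the user’s email wins, agentPhonePrimary beats agentPhoneMain, role stored as an array) are my assumptions for illustration; only the field names come from the two schemas above.

```javascript
// Hypothetical migration helper: folds one legacy Agent row into its User row.
// Assumptions: the user's email wins, agentPhonePrimary beats agentPhoneMain,
// and `role` is stored as an array of role names.
function consolidate(user, agent) {
  return {
    id: user.id,
    role: ['agent'],
    name: agent.agentName,
    phone: agent.agentPhonePrimary || agent.agentPhoneMain,
    address: agent.agentAddressLine1,
    email: user.email || agent.agentEmailPrimary || agent.agentEmail,
    passwordHash: user.passwordHash,
    company: {
      name: agent.agentCompanyName,
      address: agent.agentCompanyAddress
    }
  };
}

// Accepts arrays and returns an array; agents point at users via userEmail.
// Agents without a matching user are skipped.
function consolidateAll(users, agents) {
  const byEmail = new Map(users.map(u => [u.email, u]));
  return agents
    .filter(a => byEmail.has(a.userEmail))
    .map(a => consolidate(byEmail.get(a.userEmail), a));
}
```

Notice how much of the function is just deciding which of the near-identical agent fields should win. That decision is exactly the ‘logic’ the over-specific names forced into every corner of the app.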


AngularJS v2.0 and the Impending Schism

I think we are witnessing the Python 2->3 ‘Conversion’ all over again. AngularJS v2.0 introduces too many changes, not least of which is TypeScript, which is a big ask amidst the finalization of JS’s latest version: ES6.

Let me say clearly: I love TypeScript. I secretly wish the TC-39 meetings had produced it… They didn’t. However, they came up with another (totally different), also-awesome spec…

While TypeScript compiles to JavaScript, that doesn’t mean you can blindly copy & paste ‘compiled’ TypeScript. It effectively becomes required learning, just to understand annotated AngularJS 2.0 TypeScript.

Now, newbies must climb ‘Mount TypeScript’ before they can even start assembling an Angular app (with some level of understanding).

I have a feeling how this might go…

endless loop

Oh well, I’ll add it to the Newbie training list: somewhere between Basic Shell Usage and Gulp or Grunt? Godsend+Misery!

Anyway, I hope this works out…

everything is going to be ok

Polyglot Redux

Programming Languages Notes

Work-in-progress (updated Sept. 25th 2015)

I’m sure my Miscellaneous Observations have been made before, but here is my list of most interesting languages:

JavaScript

My One True Love, supremely versatile & ubiquitous - the all-around, amazingly-powerful champ! It has been the #1 Most Active/Popular Language on GitHub.com for years running.

I hate to admit it, but for years I foolishly had nothing but scorn and derision for what is now my favorite language.

ES6 has only increased my addiction, I mean love. While pure ES5 will always hold a special place in my heart, each time I use some ES6, I feel that radioactive spider-bite…

There were 4 factors which pushed me into the ES6 Camp:

  1. It’s fun. Seriously. There are tangible gains in beauty, clarity & productivity.
    • Subjective claims, you say? Let me show you a bit of ES6:
      let expired = users.filter(u => Date.now() > u.trialDate)
    • Now you don’t have to pretend you know how to use Object.create or Object.defineProperty
    • See examples below
  2. As of June 2015, ES6 is an officially finalized standard!
  3. Support is effectively 100%*! … OK, BabelJS is needed to transpile your code so it’s ES5 compatible. Historically, JS transpilers have been frowned upon. However, as of late (2014-15) things have changed, as BabelJS has become a key enabler/driver of language advancement. Tons of companies including Microsoft & Facebook use it on some of the largest sites around.
  4. The latest versions of Node include V8 v4.5, the same JS engine that ships in Chrome v45.

Examples

I’m going to show you what finally made me start drinking that ES6-flavoured KoolAid.

In my recent experience, ES6 helps you write code faster and more to the point. Because the code is more succinct, appreciably less brain power is needed to sift through and understand your old code (or a teammate’s).

I have regularly seen KLOC savings of roughly 20-50%. That’s like Kate Moss trim!

ECMAScript 5 vs. ES 2015 - Demo: Classes, Destructuring, Sexiness

// /services/users.js
class Users {
  constructor(data) {
    this.users = data || [];
  }
  expired() {
    return this.users
      .filter(u => Date.now() > u.trialDate)
  }
}
  • No more tedious code to ‘extract’ and ‘check’ fields passed to a function. Cut to example add():
// /services/users.js
class Users {
  constructor(data) { this.users = data || []; }
  add({name, email, password}) {
    // Store pwd hash, We only need to define 1 explicit `var/let` - the other vars are 'defined' with the `{fields}` wizardry above ^^^
    let hash = getSha256(password);
    return http.post('/users', {
      'name': name,
      'email': email,
      'passwordHash': hash
    }).then(usr => this.users.push(usr)); // append user upon service response
  }
}

// services/user.js (the ES5 equivalent)
function Users(data) {
  // ensure we're a real thing
  if (!(this instanceof Users)) { return new Users(data); }
  this.users = data || [];
}

Users.prototype.add = function(opts) {
  // Validate input, We need to extract the 3 fields from opts
  if (!opts || typeof opts !== 'object') {
    return Promise.reject('add() requires Opts parameter');
  }
  // Unpack data, assuming keys are there
  var name = opts.name, email = opts.email, pass = opts.password;
  var hash = getSha256(pass);
  var self = this; // `this` is not bound inside the callback below

  return http.post('/users', {
    'name': name,
    'email': email,
    'passwordHash': hash
  }).then(function(usr) {
    return self.users.push(usr); // append user upon service response
  });
};

Jumping on ES6 can feel like going from:

huh

To

wtf

To

#winning

Just keep sifting through the new stuff. Check out string templates, auto this binding, more-sane inheritance…
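
Here is a tiny sketch of those three features together, if you want a starting point. Account and TrialAccount are hypothetical classes made up for this example; the syntax itself (template strings, arrow functions, class/extends/super) is standard ES6.

```javascript
// Hypothetical classes for illustration; not from any real library.
class Account {
  constructor(name) { this.name = name; }
  // Template string: interpolation without '+' chains.
  describe() { return `Account: ${this.name}`; }
}

// Inheritance with `extends` and `super`, no prototype juggling.
class TrialAccount extends Account {
  constructor(name, daysLeft) {
    super(name);
    this.daysLeft = daysLeft;
  }
  describe() { return `${super.describe()} (${this.daysLeft} days left)`; }
  // Arrow functions keep the surrounding `this`, so no `var self = this`.
  reminders(days) {
    return days.map(d => `${this.name}: ${d} of ${this.daysLeft}`);
  }
}
```

For example, new TrialAccount('Ann', 3).describe() returns 'Account: Ann (3 days left)', and the arrow function inside reminders() can read this.name without any self/that workaround.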

Node.JS

Rust

Official Site

  • Pros

    • Imagine if there was a language as fast as C and as powerful as Python/C++, yet without the complexity/pitfalls that usually trap even the most skilled devs.
    • In fact I’d guess Rust is roughly as complex as the ES6 spec.
    • It includes a ton of extras:
      1. Rust compiles (via LLVM) from a high-level, expressive syntax straight to native machine code, much like C does.
      2. The compiler enforces all the best practices in C you would probably screw up on; I eventually always do.
      3. Automatically you get:
        • Auto memory management (no need for a slow garbage collector!)
        • Perfectly scoped object ownership/locking (mutexing & context switching minimized)
        • Object lifetimes (automatically implemented*, and checked like you knew every edge case)
        • Prevention of whole classes of run-time errors (seriously, your code-paths become explicit: you just can’t overlook a code-path)
    • Oh yeah, it throws in true language extensibility with a sensible ‘macro’ feature.
      • Need Comprehensions? Scala style? Done. Like Python? Done.
      • Too good to be true? Nah, It gets better:
      • Bleeding edge indicators (github.com stats) reveal Rust is highly competitive or even beating Go (Google’s hot-newish language)
        • About 4K More Stars than Go (currently around 12,200)
        • More total Contributors ( 2x! - 1,071 vs. Go’s 479 )
        • More forks ( 3X! - 2,343 vs. 765 )
        • Number of Open Issues, Loses by a hair ( 2,000 vs 1,730 from Go )
        • Pull Requests (Rust 70+ vs. Go’s 1)
      • I had to triple check the numbers too.
    • Other libraries are very stable due to the constructs & rules of Rust.
    • Threading model usable by mere mortals
  • Cons

    • Decent web frameworks are relatively new, untested, and usually undocumented (though they are getting very impressive - as of March 2015).
    • Lots of early pre-1.0 breaking changes

Python

  • Pros
    • An overwhelmingly complete assortment of algorithms is already implemented in Python (see: scikit-learn, NumPy, matplotlib, PIL/Pillow, etc.)
    • Very fun to write! Comprehensions and unpacking (destructuring) are great features and make other languages seem just bloated!
# dummy code: defines a color + pixel-coord
def pixel(x, y, r, g, b): return (x, y, r, g, b)
# Create a new pixel tuple and unpack it into a set of vars
x, y, r, g, b = pixel(10, 20, 255, 255, 255)
# Now x == 10, y == 20, and r, g, b are all 255
  • Tuples and arbitrary sets are so easy

  • Cons

    • Annoyingly, Python 2.x and 3.x are incompatible. The Great Schism continues, so many years later.

Haskell

  • Pros
    • Very rewarding when you finally memorize enough syntax to whip up comprehensions-based expressive patterns
    • You will learn mind-bending code patterns - often somewhat applicable to other languages.
  • Cons
    • Syntax & Patterns can be hard to get used to.
endless loop

SmallTalk-80

  • Pros
  • Cons
    • You will likely never use this language for anything. Zero projects. However, it will have more of an impact on your coding style, and faster, than most functional languages… (This should be in the pros list.)


Docker rocks. Boot2docker just sucks.

Overview

To everyone on OSX or Windows: Don’t let Boot2docker leave you with the impression that Docker sucks! It’s really just your antique OS.

  1. Docker is amazing, period.
  2. However, its rough-around-the-edges, hacky utility, boot2docker (for OS X, Windows and old Linux kernels), leaves a lot to be desired.

Issues

Boot2docker causes 99 out of 100 headaches compared with using a native docker install locally. I should concede that it wraps several other complicated/flaky technologies: VirtualBox and cross-platform folder sharing. Also, the docker CLI command runs in a network-client mode, so file copying, builds, etc. take a long time vs. running a native docker server.

Docker can currently only run natively on a Linux Kernel 3.4+ - and the current boot2docker vm actually runs v4. Bottom Line: Install the Latest Debian (w/ xfce or MATE) on your Mac/Windows box, … c'mon those games aren’t helping your code…

Boot2docker Key Commands

When you get error: ‘FATA[0000]’

  • Full error message:
  • Solution: You need some info from boot2docker
    • Run this to get the 3 needed shell environment variables:
boot2docker shellinit
# Copy & paste the printed exports into the current shell, then retry: docker info

Get Docker Server IP Address

boot2docker ip

Now your app on port 3000 is available at something like: http://$(boot2docker ip):3000/

Boot2Docker Quick Start for OS X

  1. In a terminal on your Homebrew-enabled Mac:
brew install boot2docker
boot2docker init
boot2docker up