Better with Butta!: October 2016

Wednesday, October 19, 2016

Headers, Caching, and Cookies: Oh my!

I had to pound my fists on the table again when the topic of cookies, caching, headers, and all sorts of jargon popped up when talking about the tech of webapps. Webapps, like online shopping cart apps, are one of those concepts that are simple on paper but can easily become convoluted, especially when terms get swapped around or poorly defined. So I'm running down the list based on experience.

Since this is a pragmatist approach, not theoretical smoke & mirrors, I'll freely refer to common software implementations of the concepts, including Apache HTTPD, JBoss/Wildfly, and the all-powerful cURL.

Back to Terms & Definitions

Back to basics. Web services are almost entirely based on a protocol that was purposely designed to NOT maintain the state of any communication, aka HTTP. "It is a generic, stateless, protocol"[1]. Technologists have, since the original inception, extended and modified HTTP to enable stateful mechanisms. :sigh: Those modifications include:

Stickiness - "stickiness" is nothing more than an antonym for "load balancing". In other words, you actually _want_ your web client to be serviced by a particular server (affinity). Implementations of stickiness are often done with HTTP Headers! :gasp: Not especially what they were intended for (see Web definition) but implementations in Headers include:

Cookie - statefulness; https://tools.ietf.org/html/rfc6265
Sessions / SESSIONID (JSESSIONID/PHPSESSIONID) - more statefulness
jvmRoute - mod_jk / mod_proxy[2]
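
To make the jvmRoute flavor of stickiness concrete, here is a minimal Python sketch. It assumes the common Tomcat/JBoss convention where the backend appends its route name to the session ID after a dot (JSESSIONID=<id>.<route>). The cookie value and the worker name "node1" are made up for illustration, and route_from_cookie is a hypothetical helper, not part of mod_jk or mod_proxy.

```python
from http.cookies import SimpleCookie


def route_from_cookie(cookie_header, cookie_name="JSESSIONID"):
    """Return the route suffix of a session cookie, or None if there is none."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    morsel = jar.get(cookie_name)
    if morsel is None:
        return None
    # Split on the last dot: "A1B2C3D4.node1" -> ("A1B2C3D4", ".", "node1")
    _session_id, dot, route = morsel.value.rpartition(".")
    return route if dot else None


# Hypothetical Cookie header as a browser might send it.
print(route_from_cookie("JSESSIONID=A1B2C3D4.node1; theme=dark"))  # node1
```

A load balancer doing this kind of affinity only needs the suffix: it maps the route name to a backend and skips its usual balancing decision for that request.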

Authenticated Sessions - obviously not the same as a bare bones web session; anyone who tells you that all you need is to put everything, including sessions, behind HTTPS is full of security blackmail. HTTP over TLS/SSL has nothing to do with authenticating the session; it may (though not always) be authenticating the communication from client to server -- but that is it! An Authenticated Session is not achieved by HTTPS.

Caches - not to be confused with Cookies; client-side (web browser) cache[3] -- often mislabelled as "cookies"; server-side (web site) cache -- mod_cache
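
Since caches are driven by the Cache-Control header [3] rather than by cookies, a small Python sketch may help separate the two. This is a simplified parser I am assuming for illustration: it splits on commas and does not handle quoted values that themselves contain commas. The function names are hypothetical.

```python
def parse_cache_control(header_value):
    """Split a Cache-Control header into a {directive: value-or-True} dict."""
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        # Directives without "=" (public, no-store, ...) are stored as True.
        directives[name.strip().lower()] = value.strip().strip('"') if sep else True
    return directives


def browser_may_store(header_value):
    """True unless the response forbids storing it at all (no-store)."""
    return "no-store" not in parse_cache_control(header_value)


print(parse_cache_control("public, max-age=3600"))
print(browser_may_store("no-store"))
```

Notice that nothing here touches a Cookie header; the browser cache and the cookie jar are separate mechanisms.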

Cache everything! Actual web services payload may vary. Based on data from one of the largest content delivery services for the Web, Akamai, the top 3 culprits slowing down web payloads are: 1) images, 2) images, and 3) images.

Load balancing - only makes sense when talking about volume (throughput); has no bounded direct link to latency! You can have a well balanced load that includes a session that is simply slllllooooww (latent). That in no way means the lb (mod_proxy, mod_cluster for Apache HTTPD) is broken but could very well mean a stuck thread on the backend JBoss or DB.
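
The throughput-versus-latency point can be shown with a toy Python simulation. The backend names and service times are invented for illustration; this is plain round-robin, not how mod_proxy or mod_cluster actually pick a backend.

```python
from collections import Counter
from itertools import cycle


def round_robin(request_times, backends):
    """Assign each request to the next backend in turn; return (backend, seconds) pairs."""
    rotation = cycle(backends)
    return [(next(rotation), seconds) for seconds in request_times]


# Hypothetical service times in seconds; the fourth request hits a stuck thread.
times = [0.1, 0.1, 0.1, 9.0, 0.1, 0.1]
assignments = round_robin(times, ["jboss-a", "jboss-b", "jboss-c"])

per_backend = Counter(backend for backend, _ in assignments)
slowest = max(assignments, key=lambda pair: pair[1])

print(per_backend)  # every backend received the same number of requests
print(slowest)      # yet one request was still painfully slow
```

The load is perfectly balanced and one session is still slow, which points at the backend rather than the balancer.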

References

[1] https://tools.ietf.org/html/rfc2616. versus https://tools.ietf.org/html/rfc6265
[2] http://tomcat.apache.org/connectors-doc/reference/workers.html
[3] pg. 108 / Section 14.9 Cache-Control, https://tools.ietf.org/html/rfc2616#section-14.9

Security: Smartphones v PC: Deja vu

My officemate's smartphone was stolen. By the time she got online via her laptop (because she was nowhere near home), the remote lock or wipe couldn't find her phone. It had been stolen/wiped.

Smartphone security paranoia popped up again in conversation when a friend looked over my shoulder and saw how ridiculously long it took me to type my password. He quipped:

"You know that's not secure".
I said, "I've encrypted the phone."
"Oh I can get around that", he said.
"You mean the USB connection and an Android debugger? Yes I suppose you could eventually brute force it but that will take awhile. By then, I'll remotely wipe the phone."

My hope was to avoid a debate about fundamental problems with IT security, and share my belief that a simple risk/reward exercise (including the annoyance of securing a tech device versus its usability), should result in enough security to give hackers at least a big headache if not full-on despair. My belief about consumer electronics security is: hackers with a specific vendetta against YOU personally -- so someone wanting to ruin your life -- will invest considerable effort in tearing down layers of security; versus hackers at large looking to exploit maximum reward with minimum effort by targeting smartphones/PCs/etc en masse will skip over consumers with decent layers of security. In other words, most folks have more to fear from their closest friends and family -- who already have access to private or personal information about you anyway -- than anonymous hackers who only know you as an IP address, but only IF you've dotted your i's and crossed your t's.

So smartphone security should target two camps:

local / physical access
remote access

Assume an ex stalks you online, goes covert to get a job as a repairman so she gets access to the building you work at, secretly stalks you at work to figure out when you occasionally leave your smartphone on your desk, and nabs your phone while you're in the bathroom. OK now your ex has local / physical access to your smartphone. Only a few common security "states" exist for any smartphone:

unlocked screen
locked screen
connected phone
disconnected phone

The interesting thing to notice about this list of smartphone states is that they apply to computers generally, especially the old PC from yesteryear. Desktops, laptops, smartphones, and other consumer technologies have many security traits in common, and this commonality means that basic security concerns for computer technology in general apply to smartphones specifically. For a PC from 20 years ago, the list of solutions was:

screensaver w/ password

disconnected PC (no LAN)
encrypted disk

20 years ago, it was silly to leave a computer screen unlocked, and trivial to get data from the computer if its disks were not encrypted and someone, like your ex, still had physical access to your PC. The same applies to smartphones today.

My friend's theory was that a hacker just inserts a USB cable to bypass a smartphone password. Google Android smartphones have had local storage encryption since 2011, and Apple iPhones have had local storage encryption since 2009, so both major smartphone manufacturers finally caught up to Blackberry RIM (which had been encrypting smartphones since insert date) to close that security hole. (One technical point: Android still does not encrypt external or expandable storage -- SDCards or any storage media that is not primary storage -- but Blackberry's later OSes did encrypt removable storage media. Another point of digression: Blackberry smartphones were so highly regarded for their security that some governments, such as India, outlawed their sale or used their own intelligence agencies against Blackberry's secure communications.)

Seeing the same risk profiles repeat through history, and techies giving newer yet similar security solutions, yet consumers facing the same pitfalls, I ask myself: have we learned anything as consumers of technology??

Security: a Techie's Steps for Safely Browsing Nowadays

I use these steps for safely browsing nowadays. Ordered sequence matters!

Verify eMail Anti-spam/filter - eMail provider (Gmail/Outlook/Yahoo), eMail client (Outlook), 3rd party (McAfee/Norton/Avast)
Enable automatic updates - OS (Windows, OSX, Android, iOS), Apps (Windows Store, Google Play/Store/Apple iTunes/Store)
Enable secure web browsing - web browser (Chrome/IE/Firefox) - TLS, no SSL3
Use Multiple, difficult passwords - 3tier pyramid approach
Use complex online account hints - the answer to "your first pet's name?" isn't actually my 1st pet
Enable 2FA (Two Factor Authentication) Online
Secure 2FA devices - encrypt smartphone/tablet, remote erase lost smartphone/tablet - iOS/Android
Encrypt local storage - laptops, tablets, smartphones
Encrypt cloud storage - Google Drive, Microsoft OneDrive, Apple iCloud
Filter/Block online ads - web browser extension - ABP/uBlock
Verify public profiles - Pipl
Setup local anti-virus - Avast/McAfee/Norton
Setup local anti-malware - Malwarebytes

An Open Complaint Against Web Analytics and Advertising

I’ve grown increasingly dissatisfied with web browsing speeds but not for the usual reasons. Usually folks complain about staring at a screen that is barely able to load a website because they’re at a congested cafe whose public Wifi is overwhelmed or out at a remote beach where their signal is down to the dreaded “1 bar”. That’s not my complaint. Instead, I blame a deluge of web analytics and online advertising for reducing my browsing experience to digital snail speeds.

The Modern Website

When I browse to any modern website — well OK the choice pick would be a retailer’s website that’s full of ads — I look around the edges of my web browser as the page loads. For example, my web browser is Chrome so when I browse to target.com the bottom of Chrome shows “waiting for facebook.com” and other websites besides Target’s. Facebook and Twitter are just two examples I notice but it seems like a deluge of websites stream across the bottom of my browser while Target’s homepage slowwwwwwly appears on my screen. I know what technology is behind these other sites loading but I wanted to get the scope of its impact on browsing, so I downloaded a web browser plugin called “Ghostery” to tell me how many other websites were hit when I went to target.com. My browser hit over TWO dozen websites just to browse to the target.com homepage.

Modern Web Technologies

For full disclosure, my browser wasn’t loading over two dozen websites in their entirety but don’t let advocates of this technology trivialize its behavior as “simple hops” made while going to a website. These “simple hops” are reducing my browsing experience to snail speeds -- or as we say in the business: “causing latency”. This latency behavior is the effect of two modern web technologies: 1) web analytics, and 2) online advertising. These technologies enable digital marketing, including online campaigns, and targeted advertising. Both of these web technologies have been in the spotlight by privacy fanatics because of their intrusive personalization of websites but that’s not my complaint here. I’m a huge fan of capitalism and marketing because I’d rather see an ad about a Star Trek movie than a new bra. My complaint is the negative experience I’m having because of the sluggishness these technologies are causing when I’m shopping online. In other words, my online shopping experience is taking a “hit”. The irony with a negative online experience from analytics and advertising is that both of these technologies are intended to enrich our online shopping experience.

Latency Check

In their defense, advocates of web analytics and online advertising will say: a) the “hops” should have been optimized to reduce latency and b) the value of marketing results and targeted advertising outweighs any latency. The latter (b) defense is a slap in the face for criticizing technological advances so I’ll simply ignore it. The former (a) defense is technologically sound advice but, sadly, it isn’t working. Using another website tool, Pingdom, I made a cursory check of the time these hops burn up when hitting Target’s homepage. It took 2.5 seconds to get target.com (http://tools.pingdom.com/fpt/#!/jFbn7/target.com). The latency Pingdom calculates excludes the additional time your web browser needs to display that homepage, or what technologists call “rendering time” (when your browser renders the page to your laptop or phone), so any time seen by checks like Pingdom implies an even longer period of time before a person browsing can interact fully with the webpage.

Again, defenders of analytics and advertising technology will chastise me for ignoring the Elephant in the Room: the biggest latency comes from getting and rendering media, like images, because big pictures must be downloaded to the browser so you can see them. They are right but the Devil is in the details. For example, the biggest chunk of time spent getting target.com comes from images (65% of time). Web analytics and online advertising are embedded in the webpage as scripts and, like images, must be downloaded, but they are executed by the browser instead of displayed. In this example, Pingdom also found that the total size of images is not much bigger than the size of web analytics and online advertising (842kB versus 698kB*). So target.com has nearly as much scripting payload as it does images. There could be scripts other than analytics and advertising but another indication about the cause of latency comes from the amount of time Pingdom’s browsing spent “connecting” and “waiting” rather than receiving Target’s homepage. Less than half of the time I’m waiting to click around for a new coffee maker on target.com is when my web browser actually receives the webpage. Speaking of “Devil”, as a comparison I pointed Pingdom against the old Internet Explorer is Evil webpage (http://toastytech.com/evil/) … that webpage only took 306 milliseconds to get (http://tools.pingdom.com/fpt/#!/bNJGe7/http://toastytech.com/evil/). Ghostery also found zero web analytics or online advertising on the Evil webpage.
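
For anyone who wants the arithmetic behind "nearly as much", here is a quick Python check using only the figures quoted above (842kB of images, 698kB of scripts, 2.5 seconds versus 306 milliseconds). The variable and function names are mine; nothing here is re-measured.

```python
IMAGES_KB = 842    # image payload reported for target.com
SCRIPTS_KB = 698   # script payload reported for target.com
TARGET_MS = 2500   # load time reported for target.com
EVIL_MS = 306      # load time reported for the toastytech.com page


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


scripts_vs_images = share(SCRIPTS_KB, IMAGES_KB)
scripts_of_combined = share(SCRIPTS_KB, IMAGES_KB + SCRIPTS_KB)
slowdown = round(TARGET_MS / EVIL_MS, 1)

print(scripts_vs_images)    # 82.9 (scripts as a percentage of image payload)
print(scripts_of_combined)  # 45.3 (scripts as a percentage of images plus scripts)
print(slowdown)             # 8.2  (times slower than the old page)
```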

A Magic Pill?

A smart technologist or intuitive advocate of these technologies would expect that a web server should be able to respond faster by working on components of the webpage in parallel. I would agree. When viewed with a very broad lens, modern website design could be described as distributed computing because webpages distribute the analytics and advertisements to agencies that specialize in these technologies. Web developers simply embed these agencies' technologies into their website as scripts instead of maintaining all these technologies from their own web server. Yet a broad lens overlooks the details and complications. After digging around in the world of webpage design and digital marketing, it appears parallelizing webpage rendering is difficult. Evidently modern browsers render webpages the way a fax machine scans -- in sequence from top to bottom -- so the latency problem first hits when something in this sequence takes a long period of time. A long running step, like a web analytics script embedded in the webpage, will block the rest of the webpage from rendering until the agency’s web server responds and the web browser executes their script. You’ve experienced the results of this “scanning” latency when half a webpage appears in your browser, when bars or menus appear after a center panel is already visible, and when images appear later than text. Web developers have some tricks up their sleeves for circumventing this blocking behavior, such as event handlers and optimizing the webpage sequence. For example, scripts that execute advertisements and analytics could be embedded at the bottom of webpages and not block other components from rendering so I can start interacting with the website. (I’m no expert so one good source on scripting behavior is at: http://mrcoles.com/blog/how-tracking-scripts-affect-page-loads/) But this isn’t a magic pill. Web browsers react differently to event handlers and even then the script response time is only as fast as the agency’s web server. So another technology, content delivery or edge networks, is introduced to further optimize the typical response time of web servers. Content or edge networks work well but the latency persists because of scale.
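
The "fax machine" behavior can be modeled with a toy Python simulation. It is a deliberately crude model I am assuming for illustration: every resource is handled strictly in order with made-up timings, which is much simpler than what a real browser does.

```python
def time_until_visible(resources):
    """Seconds until the last visible piece has rendered, processing top to bottom.

    Each resource is (name, seconds, is_visible). Everything is handled strictly
    in order, so a slow script delays whatever comes after it.
    """
    elapsed = 0.0
    visible_at = 0.0
    for _name, seconds, is_visible in resources:
        elapsed += seconds
        if is_visible:
            visible_at = elapsed
    return visible_at


# Hypothetical timings in seconds.
tracker = ("analytics.js", 1.5, False)
header = ("header", 0.25, True)
body = ("product grid", 0.5, True)

script_first = time_until_visible([tracker, header, body])
script_last = time_until_visible([header, body, tracker])

print(script_first)  # 2.25
print(script_last)   # 0.75
```

Moving the tracker to the bottom does not make the total work any smaller; it only lets the visible parts show up before the slow step runs.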

1 Bar

Websites are embedding such a deluge of analytics and advertising scripts in a single webpage that the gains from event handlers and delivery networks have become moot. Again, target.com called over two dozen external web servers for a single hit to their homepage. The strongest indication of a degrading problem is modern hardware running modern software rendering a modern webpage slower than a decade old webpage. My simplistic comparison of time and payload above hints at this being the root cause. Although I see great value in tracking campaign efficacy, adjusting to customer browsing behavior, and other datasets that web technologies have enabled with web analytics and online advertising, I don’t see the value of these technologies outweighing a sluggish online experience. Our online experience is being smothered by marketing technology. I might as well be browsing with only “1 bar” of signal.

... to be continued.

Predecessors of PaaS - DRAFTv1

In a datacenter far far away, there was "pre-PaaS"...

The pessimists of Cloud and stalwarts of IT should be happy … for now. Gartner’s hype-o-meter put Cloud concepts in decline (or rather the “Trough of Disillusionment”). Optimists will note that -- at least according to Gartner -- after a period of disillusionment, technology re-emerges in a stage of enlightenment, followed by stable adoption. So let’s do the timewarp again.

Way back in 2010, some smart architects on my team finished implementing an application management and orchestration platform that would rapidly change user expectations. That platform's sales slidedeck, dated 2008 [1], had neither the term “DevOps” nor “PaaS” (or any of the *aaS Cloud terms) but “Cloud” is mentioned, although fewer than a dozen times. The product’s slidedeck wasn't filled with modern buzzwords but focused on 1) dynamically orchestrating virtual resources and 2) automating application deployments.

[1]

http://www.slideshare.net/JustinPittman/savedfiles?s_title=fabricserver-technology-overview&user_login=Ivan_datasynapse

I joined that smart team of folks when our goal was to streamline and automate processes that made fast, incremental releases of a distributed, multi-tier trading desk system finally possible. Many of our peers from traditional teams, like Dev and Ops, didn't really get why we sat in between them. It was simple: because the traders demanded changes to their financial products ASAP. The traditional technical teams could not deliver feature requests at a pace that satisfied the business needs, aka the traders in our case. Users of the system would request a feature or fix through typical business analyst channels and wait WEEKS or even months for the development lifecycle to finally spit out their change to production. By 2010, our application deployments were mostly automated, and though the final delivery to Prod wasn't yet push button, we really had streamlined the process. We had DAILY AUTOMATED builds.

I had heard the term “DevOps” for the first time while working there and really didn't know what it meant, and our architects faced many cultural obstacles because we were the new kids in the cubes. Many of those cultural obstacles boil down to ongoing silos in organizations but the technology itself has now graduated to getting its own label: PaaS.

Ideas/References:

The reason rapid deployment became a necessity is that small to medium sized businesses want to reach the biggest market / "market capture". These 21st century, technology dependent customers no longer accept that any application’s functionality would be interrupted by a large maintenance window set during U.S. “off hours”, nor that products and solutions would not be geo-specific.

Skeptics of PaaS: early on in 2008 Reese blasted the de facto feature of auto-scaling that’s in most IaaS and PaaS solutions. http://broadcast.oreilly.com/2008/12/why-i-dont-like-cloud-auto-scaling.html

Velocity 2009: 10+ Deploys Per Day: Dev and Ops Cooperation at Flickr

http://velocityconf.com/velocity2009/public/schedule/detail/7641

Technical and sociological justification of continuous delivery with PaaS: McCabe's Cyclomatic complexity, and Conway's Law. Cyclomatic complexity - modular programming, use small components together; Conway’s Law:

“If the parts of an organization (e.g. teams, departments, or subdivisions) do not closely reflect the essential parts of the product, or if the relationship between organizations do not reflect the relationships between product parts, then the project will be in trouble... Therefore: Make sure the organization is compatible with the product architecture"

Being Online: Wolf in Sheep's Computer - DRAFT

Even technologists such as myself are not immune to hackers. Two of my credit cards were replaced because of the back-to-back Target and Home Depot hacks. I’ve avoided outright fraudulent purchases and identity theft but I know folks who've succumbed to these. The essential rule for safely browsing online is: perseverance. With new technology come new tricks that protect your modern, online browsing. Gone are the days of simply installing anti-virus and changing your passwords. I’m sharing some of these latest tricks here in the simplest way I can. Let's start with a "Do / Don't Do" list, with more justification as an Appendix.

First up -- Passwords are OUT! DON'T just change your passwords.

DO setup Multi-Factor or 2-way Authentication, DON'T just use passphrases
DO keep Passphrases in Your Head, DON'T blindly trust password managers or vaults
DO Pay Online with Layered Accounts, DON'T pay via Debit Cards, Checks, or Bank Accounts
DO pour on Layers of Security, DON'T just trust one website

1. Multi-Factor or 2-way Authentication

Over a decade ago, technologists productized something that supplanted the old school username/password paradigm. Technologists have blasted passwords as a single point of failure for a long time. It's a weak form of protecting your identity online. We introduced 2-way or Multi-Factor Authentication (2FA/MFA) because it added a 2nd (or Multiple) way for you to login. Imagine 2FA/MFA with this analogy:

You walk up to a closed door with key in hand. You try to turn the door handle but it’s locked so you put your key in and turn. When you turn the handle and push the door, the door doesn’t open. Instead you hear a latch open up next to your ear and see eyes peering out at you through a slit in the door that the latch opened. You hear the girl behind the door say “Violet”, so you say “Yes”. Now you hear the girl turn a hidden deadbolt on her side of the door. You push the door again and it opens!

Putting your key in the door and turning its handle is the old school username/password paradigm. The girl saying a secret word that you acknowledge and her unlocking a deadbolt on the hidden side of the door is the new paradigm: 2FA/MFA. You cannot get the door open with just your key but must also let the girl behind the door see you and respond correctly. Pretty cool!

Modern technology has enabled common devices, like your phone itself, to be a 2nd way of authenticating online because you almost always carry your phone around. Your dumbphone or smartphone receives a secret code, either via SMS txt or a mobile app, when you attempt to identify yourself online. You use this secret code from your phone alongside your username/password in a typical website logon. This new sequence means that a hacker must both: 1) find your username/password, AND 2) steal/unlock your phone.

It's also important to note that the secret codes we’re discussing are temporary, unlike passwords that are seldom changed. They're much like the Mission: Impossible line "this message will self-destruct" because a hacker doesn’t have the right secret code if they look over your shoulder or make a guess like they can do for passwords. These 2FA/MFA secret codes are quickly randomized, usually every 30 to 60 seconds.
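
For the curious, here is a minimal Python sketch of how such a temporary code can be derived from a shared secret and the clock. It follows the time-based one-time password scheme standardized in RFC 6238 (HMAC-SHA1, 30 second steps), which many authenticator apps use; whether any particular website on the list below uses exactly this scheme is an assumption on my part. The secret shown is the published RFC test value, not a real one.

```python
import hashlib
import hmac
import struct
import time


def totp(secret, at=None, step=30, digits=6):
    """Time-based one-time code in the style of RFC 6238 (HMAC-SHA1)."""
    if at is None:
        at = time.time()
    counter = int(at) // step  # number of whole time steps since the Unix epoch
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation: low 4 bits pick the offset
    number = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10 ** digits).zfill(digits)


shared_secret = b"12345678901234567890"  # RFC test secret, for illustration only
print(totp(shared_secret, at=59))  # 287082
print(totp(shared_secret))         # whatever the code is right now
```

The phone and the website each run the same calculation on their own copy of the secret, so the code never has to be stored anywhere for long and a code seen over your shoulder goes stale once the time step rolls over.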

A Picture is Worth every Pixel

A phone isn’t always required. Some 2FA/MFA technologies use your web browser as an alternative to make the 2nd verification without sending a secret code. These browser options include asking you to verify a picture code or to answer personal questions, like “What is the name of your preferred charity?” Word of caution on setting up these alternatives to secret codes: personally identifiable information (PII) is not a good choice for 2FA/MFA setup. A hacker can usually figure out your PII. For example, figuring out where you were born is trivial so that question should be avoided.

2FA/MFA provides another layer of headache for a hacker to ruin your life. Gone are the days of simply securing your online identity with a “strong” password. Sadly, online technologies have only recently begun to adopt 2FA/MFA after mulling around the elite halls of computer nerds but I’ve found the most popular online services have gotten aboard.

Here are popular online websites that allow you to setup 2-way or Multi-Factor Authentication via your smartphone:

Social media:

Facebook

Twitter

Apple ID

Google+

Microsoft Live

Banks:

Bank of America

JP Morgan Chase

Barclays

Payment processors:

Visa

Discover

Paypal

Cloud storage:

Dropbox

Evernote

iCloud

Google Drive

OneDrive/SkyDrive

(*from personal experience and from https://twofactorauth.org/)

2. Passphrases in Your Head

"The 4 frogs farted!" is a silly phrase that is more secure than "Jr1981" as a passphrase. Even spaces are valid in passphrases, hence technologists prefer call your login credential a "passphrase" instead of "password".

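A rough way to see why the silly phrase wins is to count how many guesses a blind brute-force search would need. The Python sketch below is a naive upper bound under my own assumptions: it treats every character as random over a fixed alphabet, which flatters an English phrase because real attackers also try dictionary words. The alphabet sizes and the helper name are illustrative.

```python
import math


def brute_force_bits(passphrase, alphabet_size):
    """Bits needed to enumerate every string of this length over the alphabet."""
    return len(passphrase) * math.log2(alphabet_size)


PRINTABLE_ASCII = 95  # letters, digits, punctuation, and the space character
ALPHANUMERIC = 62     # upper- and lower-case letters plus digits

long_bits = brute_force_bits("The 4 frogs farted!", PRINTABLE_ASCII)
short_bits = brute_force_bits("Jr1981", ALPHANUMERIC)

print(round(long_bits), round(short_bits))
```

Each extra bit doubles the search, so even with a generous discount for dictionary attacks, the 19 character phrase sits far above the 6 character password.
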
A Pyramid of Passphrases

Keep passphrases in your head. I'm not abandoning old school password tips entirely but don't let 20th century best practices give you any kind of comfort. We are dealing with a whole new set of technology in the 21st century. Any password that is written down, even in your Evernote/Dropbox/Drive digital notepad, is a sitting duck for hackers.

You should be grumbling about having too many passwords to remember, all while IT nerds demand that you keep creating more and more of them and higher and higher complexity! I’m one technologist who admittedly recommends using only as many passphrases as you can actually remember. Why? There is no reliable way to forcibly extract a passphrase that is just a bunch of neurons in your brain. There are means, even if difficult or improbable, to break into locked drawers full of password notes and even hack into password managers that magically hold all those passphrases for you. Just keep passphrase management to yourself.

I recommend creating a 3 tiered pyramid of passphrases for all your online activity.

Bottom Tier

Imagine the bottom of this pyramid being a ton of websites that you visit that require you to create some kind of account but keep very little or even no personal information about you. Online forums are a great example of this bottom tier of our pyramid. Many online forums don't let you search their topics or threads unless you create an account on their website, even if that account doesn't require your address, or age, or really much of anything other than a username. Keep a throw away passphrase for this bottom tier of websites.

Middle Tier

Next is the middle tier of the pyramid. Here sit a sizable amount of websites or apps that require some personal information about you. Most social media sits here, both in website and app forms. Create a

Top Tier

Finally, the small top tier of your pyramid. These are your online crown jewels -- Bank websites and apps, for example. These websites and apps not only require and maintain personal information about you but access to them has real-life effect, like paying bills, filing claims, etc. Some may put their social media accounts in this tier instead of the middle tier. I'm making no rules but the model should fit your risk tolerance. The less tolerance your life has for someone hacking into a tier -- say a violent ex hacking into your Facebook -- the more reason to put that website or app into your top tier. There is one caveat: you must trust your top tier with your most confidential information. If you don't trust the institution or their own online presence, then your account with them should not be part of your top tier. Take online credit card accounts as an example. Credit cards are famous for not only informing their consumers of fraudulent activity but many have even limited your personal liability for fraudulent charges with legalese. I've found their commitment to securing your account with them in my credit card contract. For me, that kind of legal commitment implies trust in their online account security.

4. Layers of Security

Try hacking into yourself online. Ask a flesh-and-blood friend of yours to "Unfriend" you from Facebook, Google+, Twitter, or some other social media. After being "unfriended", see how much information your flesh-and-blood friend can find out about you via their own account. This exercise is simulating how online hackers find personal or private information about you, and the results from this self-hack will shock you. Yet, there are plenty of fixes available that don't require the Nuclear Option -- you don't have to go offline. Go back into your social media account and fix the privacy or security settings around each bit of personal or private information your "Unfriend" found. After fixing up your privacy and security leaks, try hacking into yourself again. Rinse and repeat until you've clamped down on any information leaks.

MasterCard SecureCode

Verified by VISA

http://usa.visa.com/personal/security/security-program/verified-by-visa.jsp?n=1

Masquerade Cards

Bank of America has an online service that generates an ad hoc, temporary credit card that is backed by your actual credit card. This service, called “ShopSafe”, ensures that the online merchant’s payment system never actually sees your real credit card data. The Bank processes the transaction on their side by acting as a proxy for your card. This service can also generate “masquerade” credit card data for recurring, online bill payments.

Pictures at Logon

Yet Another Layer

Services using Paypal Payment / VISA Checkout

eBay

HuluPlus

Netflix

Paypal also offers a Credit Card to front any payments that accept credit, which is basically everyone.