Thursday, December 4, 2014

Is Microsoft the new Apple?


Think Different.

That's the phrase Apple coined back in 1997. It appealed to the upcoming generation who wanted to "fight against the man". In those days, the "man" was the terrible combination of IBM (Big Blue) and Microsoft; between them, they were all there was to have in computers. Microsoft beat Apple in the marketplace: Windows shipped on far more machines and was far more widely adopted, so it quickly grew in support. Businesses and personal users alike ran Microsoft Windows because that's what all the applications were written for, and people wrote applications for Windows because that's what everyone used.

Apple wanted to change that. They wanted people to use the better operating system, even if there weren't as many applications for it. They started building their own Office-like products and trying to convince the world that simpler was better.

The new age of Apple

Simpler is better. And when Apple built OS X on a Unix foundation, they ushered in a new wave of support. They were no longer just the system for the technically un-savvy; they gave geeks what we wanted: a nice GUI on top of a platform we already knew how to use.


Apple continued to grow and innovate. Eventually they launched the first iPhone: a smartphone designed for the average person. It wasn't the first smartphone, but it was the first to include an App Store and a completely new platform designed to help developers put their software in the hands of users on the go. Apple was once again at the forefront of innovation because they were the underdog. They had to innovate and produce top-quality products to get people to switch away from their existing devices.

Apple as the new Giant

After the iPhone launched, Apple continued to innovate for a few years. It's hard to pinpoint exactly when the shift happened (although I believe it was directly related to the death of Steve Jobs), but at some point after Google launched Android, Apple started copying instead of innovating. Instead of pushing forward with new ideas and building everything with their users' best interests in mind, they bowed to pressure and did what others had been doing. They made phones larger and harder to use with one hand, and even started selling cheaper versions of the phone.

Apple has been riding on their brand for the past few years. They have been stale, adding narrow, single-use features to their OS. You can tie an app into Siri, but only in a very specific way, through the home-automation APIs. You can share data between apps, but only through HealthKit, which is designed to share health-related data. Everything so far has been piecemeal instead of a better, more general solution.

Microsoft is the new little guy

Everyone forgot about Microsoft. They bought Nokia's phone business in an attempt to enter the smartphone market. They built an entirely new OS around touch and widgets. They looked at how users actually want to use their devices, and have even produced the Microsoft Band, a wearable specifically advertised as a way to stop technology from massively disrupting your everyday life. They've built a new OS shared between desktop, tablet, and phone (although only the "new" apps work on all three). The problem with their software right now? Nobody writes apps for them.

Because nobody writes apps for them, nobody wants to buy them. Because nobody wants to buy them, nobody writes apps for them.

Wait, doesn't this sound familiar?

Where do we go from here?

I've always loved Apple products, but I'm not an Apple fanboy. Apple currently makes the best PCs on the market, and the best smartphones as well. Microsoft is catching up quickly, and they've already developed a superior OS; it's just not quite ready for prime time.

The concepts are there, the ideas are there, but Microsoft needs to execute as well as Apple did back in the '90s. They need to encourage people to Think Different. The next time you're thinking about dropping $650 on an iPhone or Android phone (and yes, you're really paying $650 despite what your carrier tells you), take a closer look at a $150 Windows phone instead.

As for me? I have no plans to switch my desktop to Windows unless they also embrace Unix. I'm a coder, and I use the terminal every day. We develop iPhone apps because that's where the money (and the users) are.

However, my next phone, in about a year or two, will most likely NOT be an iPhone.

Wednesday, December 3, 2014

Logentries - A different kind of Logging service

NOTE: I originally reviewed Logentries in a post I did for TheServerSide.com. In it, I had confused Logentries with another logging service I had checked out that was an application-level logger only. Logentries does accept logs from syslog.

Logentries was actually recommended to me after I wrote a blog post about our search for log services. After checking it out for a little while, I realized pretty quickly that the service was not for us. Logentries has a unique approach to logging, for both applications and systems. Instead of deciding what to SEND to Logentries, you have to decide what to keep searchable in Logentries.

That makes for a very different pricing model. With their new unlimited logging service, you can send as much data as you can produce, from as many servers as you have, and still only pay for what you actually process and index. That means if you want to send all your debug logs, but you don't need to make them searchable except for that one period of time where you had an issue, you can do that.
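Because Logentries accepts syslog, the simplest way to try that "send it all" model from an application is to point a standard syslog handler at your local daemon and forward from there. Here's a minimal Python sketch, assuming the local syslog daemon is already configured to forward to whatever endpoint and token your Logentries account gives you (none of those specifics appear here):

import logging
from logging.handlers import SysLogHandler

# Minimal sketch: log everything (DEBUG and up) to the local syslog daemon,
# which is assumed to be forwarding on to Logentries.
logger = logging.getLogger("myapp")
logger.setLevel(logging.DEBUG)
logger.addHandler(SysLogHandler(address=("localhost", 514)))

logger.debug("cheap to send; only costs money if you choose to index it")
logger.error("the kind of line you would actually keep searchable")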

Pros:
  • Lots of features (live-tail, alerting, etc)
  • Send everything, index what you need
  • “Unlimited Logging”
  • Parse your logs how you want them
Cons:
  • Longer initial setup to parse your logs
  • Need to tell it what you want to index
  • Expensive depending on how much you want to index

Tuesday, July 29, 2014

Algolia understands Service

The final "S" in SaaS is important. Sometimes the difference between two products can be that Service. In Algolia's case, they screwed up. They made a mistake that caused them to lose all of our documents in one of our indexes. Within a few minutes of contacting them, they had figured out what happened, and were working to correct it. Unfortunately they couldn't restore our lost documents, and they felt very bad for this happening.

The Contract doesn't matter


Despite not having an official SLA with them, they knew that we were a customer, and one they wanted to treat right. They didn't point to a document and say "well, the service was only down 1% of the time, so we're only going to give you back 5% of your bill". They are a small company with a great product and a fast-paced, innovative team. They didn't hide behind an SLA; they went above and beyond to make sure we were happy.

Algolia owned up to the mistake.

We all make mistakes. They owned up to theirs and reimbursed our entire bill for the month. They admitted it was their fault, and they were willing to do anything and everything to apologize and minimize the impact on us.

Algolia understands customer relations 

Algolia cares, unlike Splunk. They responded quickly, found the problem, and gave me steps to correct it, along with their assurance that they had implemented test cases to prevent it from happening again. It wasn't catastrophic, and our systems didn't go down because of it, but it did inconvenience us, and they were well aware of that.

Happy customers are repeat customers

I'm incredibly happy to see Algolia spend so much time making sure we feel satisfied with the service. They're constantly improving things and taking our feedback, and giving us features that we said we'd like to see. They're keeping us in the loop, and when something bad happens they fix it.

If you need a distributed search system, I strongly suggest Algolia. It's fast, the service is great, and the system is incredibly reliable. We've had only a few issues over the past few months we've been using it, and none of them were true downtime. They looked at our specific use case and made improvements to their backend to keep our searches fast, rather than telling us to redo how we use the system to make it run fast.

They take the initiative to make the system work the way we want it to. They don't force you to use something they've structured for a specific purpose.


SaaS - The "Service" is important

I really hate it when you find a brilliant piece of software, something that will aid your daily productivity, and then find out you can't use it because of mismanaged business practices. For a while now at Newstex we had swapped between multiple logging solutions, until we finally landed on SplunkStorm. SplunkStorm was designed to be the SaaS version of Splunk: all cloud-based, allowing us to push all of our logs directly to it and not worry about managing servers or upgrading software. It just worked, and it worked pretty well. When they finally announced pricing, we were quick to jump on board. We picked up 500GB worth of logging to keep our logs around for 30 days. It did graphs and post-mortem analysis, and even allowed us to post-process messages after they had already been sent to the system. We could get analytics out of it without a problem, and they even added support for alerting (although in beta).

Then the business team got involved

Unfortunately Splunk is a public company, and they decided that SplunkStorm wasn't going to be supported anymore (I still have several outstanding support emails from months ago that will never be responded to). You can't pay for more than the 50GB of storage (which is now just free). It's a "lite" version, and you also can't get alerts added to new projects. We quickly exceeded the 50GB/month storage quota, but couldn't get an upgrade without going to Splunk Cloud.

In theory, this was a great idea: Splunk Cloud was just Splunk run as a service, not a modified version. That meant we could get an API into the system and integrate with third-party services like Tableau. This was great, we thought, so we started the process to see what it would take to get the service.

Splunk does not want to sell to small companies

They advertise themselves as the "Enterprise" logging solution, and they don't want to sell to small fish. They say they want new clients, but they aren't willing to work with you. The worst part was that they have absolutely no idea what a remote company is. They passed us off from salesperson to salesperson because they couldn't figure out "whose region we belong to". The first person we talked to wasn't going to help us because he wouldn't get credit for the sale.

Competition among employees is good for the company, bad for the customer.

I just wanted to buy

We didn't care who got "credit"; we wanted to buy. We were ready to give these people our money: $10k/year (which was higher than we really wanted to spend, but it was the cheapest they would offer). We wanted to know where to sign up; they had no idea what that meant. Their system only worked by sending a P.O., and they could only bill annually.


Over a month later

Four weeks went by, still with very little idea of what was going on. We asked to be billed quarterly; they didn't respond for another week or so. Finally, at the end of the month, we got a response from someone who was ready to talk about sales, but who still hadn't gotten final approval to bill us quarterly, just that they were "working on it".

Too little, too late

At this point we'd decided to switch and check out the new Loggly Gen2 system. We're still in the process of switching over, but the sign-up process was painless. You fill out a few sliders to configure exactly how much you want to spend, enter a credit card, and you're done. That's a SaaS system. We did have a small hiccup in the beginning because we had an old account, but a quick tweet got a response within a few minutes and a resolution within a few hours.

So we're back to Loggly, at least for now. Why? Not because Loggly's software is better, but because the SERVICE they provide is better. Loggly started as the service for small companies, and they get it. In a small company, every resource is important.

Thursday, July 17, 2014

How NOT to do 2 Factor authentication (MailChimp, this means you!)

UPDATE: You can enable QR code/Authy for AlterEgo

Thanks to a co-worker, I learned how to enable QR-code-based authentication for AlterEgo. After logging into AlterEgo via the website, go to "Integrations".


Under "Google Authenticator" choose "Connect":


This will generate a QR code you can scan with Authy or any other standard software MFA app!

How NOT to do 2 Factor authentication

Two-factor authentication is great. It's the latest craze, but it's also a good idea. In general, the password is obsolete. Anyone can guess or brute-force a static password, and making people change their password regularly is lame. They forget, which means you need to have a way to let them reset it.

If it's something they're typing on mobile devices, it's probably going to be pretty weak, and the more you have to type it, the less secure it will be.

A multi-factor (or two-factor) authentication token solves many of these problems. People will always make insecure passwords, so a second form of authentication is key. There are three main types of authentication factors:


  • Knows Something (Password)
  • Has Something (Authentication Token)
  • Is Something (Biometrics, like a fingerprint)
Breaking into the "Has Something" is critical, but it's also important to make sure it's not an obstacle. There are standards out there for how to do authentication tokens. Almost everyone generates a QR code that you can scan on your mobile application, and/or just uses SMS.

Yes, this does mean there's a QR code out there that someone could hijack, but hopefully that QR code is never printed and is instead kept securely on the user's device. If you're like me, you use Authy, which does back up your MFA tokens, but also requires you to provide more information when you need to restore, only allows one device at a time, and requires a secondary form of MFA (such as SMS) if you do need to restore.
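The standard behind most of those QR codes is TOTP (RFC 6238): the QR code just encodes a shared secret, and both sides derive the same six-digit code from it every 30 seconds. Here's a minimal sketch using the pyotp library; the account and issuer names are illustrative, not from any particular provider:

import pyotp

# Server side: generate a shared secret and a provisioning URI.
# Rendering the URI as a QR code is what Authy/Google Authenticator scan.
secret = pyotp.random_base32()
uri = pyotp.TOTP(secret).provisioning_uri("user@example.com", issuer_name="ExampleApp")
print(uri)  # otpauth://totp/ExampleApp:user@example.com?secret=...&issuer=ExampleApp

# Later, verify the six-digit code the user types in. valid_window=1 allows
# for a little clock drift between the phone and the server.
totp = pyotp.TOTP(secret)
print(totp.verify("123456", valid_window=1))  # True only if the code matches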

Other providers, such as RSA, allow for physical MFA tokens. These are by far the most secure, but they're also expensive, and a hassle if you have a bunch of them. I have one each for my 401k, PayPal, and AWS accounts. Everything else is a software auth token.

Google Authenticator does not do backups, and if you upgrade your phone you lose all your tokens. Not ideal, but still not as bad as...

MailChimp, you're doing it wrong

MailChimp introduced MFA. Pretty great, right? You don't want someone getting ahold of your client list; that could be pretty bad.

But they don't use a standard like a QR code, a physical token, or plain SMS. Nope, they use a third-party company called AlterEgo.

First off, when you search for "Alter Ego" in the App Store, this app isn't what comes up. That's pretty bad in itself, but not the worst part.

The worst part? They don't do two-factor authentication like anyone else. The app is a packaged mobile web page, and you can tell. It is NOT optimized for touch screens, let alone small devices. It requires a username-and-password login... wait, isn't that what MFA was supposed to be solving for us?

Worse yet, while it DOES have time-based codes, those codes are also one-time use. The interface has no simple way to generate a new code until the old one expires, even if you've already used it. In MailChimp, you often have to log in all over again (another issue), including when you add new people or are setting up your account for the first time. This means you need to type in your AlterEgo token multiple times within the one-minute window it takes for the token to "expire". So you have to wait... you can't just generate a new token, even though the one on the screen no longer works.


PLEASE, MAILCHIMP, DROP ALTEREGO!

It does not make me feel more secure. In fact, it breaks your normal workflow and makes your service difficult to use. There is no reason you can't generate a QR code and support every other standard type of MFA out there, or even just use SMS. You already offer SMS as a backup, but you can't set up your account to use SMS alone.

Please please please, don't continue to require AlterEgo.




Wednesday, July 2, 2014

Google Cloud, still not ready for the real world

With all of the new buzz around the fancy features Google has been launching lately, I decided to take a stab at using Google Cloud for one of my new projects at Newstex. The project is simple and isolated, so there was nothing to tie me to AWS and nothing to prevent me from testing the waters.

What I discovered, however, is the reason why many people are still primarily focused on AWS instead of Google Cloud: there are a lot of shiny features, but it's missing the important parts.

Security

Let's start with the most important part of any cloud infrastructure: security. It's absolutely paramount that you can control access to resources and constrain who (and what) has access to different elements of your cloud environment. For example, you don't want a process to have access to start and stop servers if all it needs is to read files from your storage system. You don't want to give your new client access to every storage bucket when they only need to download specific files. And you don't want to give the new DevOps hire full write-level access to kill all your servers and horrendously screw things up before they know what they're doing.

For all of these things, Amazon developed IAM. You can securely control exactly what access any given set of credentials provides, in some cases down to an incredibly low level of granularity, such as limiting someone to a specific prefix in an S3 bucket or a certain subset of items within a DynamoDB table. The granularity is extreme, and it's very easy to construct access control rules that prevent abuse as well as simple misuse.
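As an illustration of that granularity, here's a hypothetical boto (2.x) snippet attaching an inline policy that limits a user to reading a single prefix in one S3 bucket. The user, bucket, and prefix names are made up for the example:

import json
import boto  # boto 2.x, the library mentioned later in this post

# Hypothetical policy: this user may only read objects under one prefix
# of one bucket, and nothing else.
policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject"],
        "Resource": "arn:aws:s3:::example-bucket/client-a/*",
    }],
}

iam = boto.connect_iam()
iam.put_user_policy("client-a-reader", "client-a-read-only", json.dumps(policy))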

Google Cloud lets you split access apart by Project. That's it.

APIs and Documentation

Here's an interesting test for you: try to figure out how to write a row of data into a BigQuery table from a Python script. Better yet, try finding out how to do it in Node.js. While Google does consider Python a primary programming language, if you're just trying to access an API through a script, it takes a lot of bootstrapping just to get it to work. There's no simple library like "boto" to handle all the automatic magic required to connect. API keys don't even work with BigQuery, and API keys aren't secure anyway (or really secret, since you're sending them along with every request). There's no simple signed-URL scheme; it all relies on OAuth2.

Have you ever worked with OAuth2? It's a complete pain in the ass. Not to mention it's designed for a web-based workflow, not a server-side script. So when you do finally manage to get that script working, you have to go to a browser to authorize your request, then store the access tokens for future use. Oh, and those expire. That requires more manual intervention.
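To give a sense of the bootstrapping involved, here's roughly what streaming a single row into BigQuery looked like from a Python script at the time, using a service account to avoid the browser step. The library and field names are from memory of the 2014-era google-api-python-client and oauth2client, and the project, dataset, and account names are placeholders, so treat this as a sketch rather than gospel:

import httplib2
from apiclient.discovery import build
from oauth2client.client import SignedJwtAssertionCredentials

# Service-account credentials: a .p12 private key downloaded from the
# console, plus the service account's email address.
with open("key.p12", "rb") as f:
    key = f.read()
credentials = SignedJwtAssertionCredentials(
    "1234567890-something@developer.gserviceaccount.com",
    key,
    scope="https://www.googleapis.com/auth/bigquery",
)

bigquery = build("bigquery", "v2", http=credentials.authorize(httplib2.Http()))

# Stream one row into an existing table via tabledata().insertAll().
bigquery.tabledata().insertAll(
    projectId="my-project",
    datasetId="my_dataset",
    tableId="events",
    body={"rows": [{"json": {"event": "signup", "count": 1}}]},
).execute()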

It's not a usable API if it requires manual intervention. Yes, you can use it from within AppEngine, or within Google Cloud (with some work), but what if I don't want to run it within the Google environment? Why am I so tied down to their systems? Even then, it's not very obvious how to get things working.

This is the fault of two things: the API is not well designed, and the documentation does not describe it well. A simple REST API with JSON would solve this issue, provided there was also decent documentation. There should be documentation and API wrappers for at least the most popular languages out there. I shouldn't have to resort to reading through Python library code to find out how to authenticate myself.

Oh, and if you offer an API Key, make it work with everything, and make it secure.

Consistency

Google is in a state of transition right now. If you ever talk to a Google engineer, it's very obvious that the teams at Google are in a state of unhealthy competition. The BigQuery folks don't like the Cloud SQL folks, and the Compute Engine folks are at war with the AppEngine folks. This shows in the design of each system. The "Cloud Console" doesn't have everything in it yet; you frequently have to go out to the old consoles to get the full functionality you want (like certain APIs, or certain features of BigQuery).

Yes, this will change, but it is a symptom of a larger problem at Google. There is a lot of internal politics that bleeds out and hits the consumer. It's unfortunate that the teams at Google don't know how to cooperate and realize that when one team succeeds, the entire company does (just look at how long it took for Android to adopt Chrome as the default browser).

With Material Design being released, hopefully this will change soon. This is simply a growing pain. Google will solve it eventually, but for now we'll have to wait for them to resolve their internal disputes before we, as consumers, get a good experience.

Quantity of Services

The quantity of services available at Google is pitiful. Google is very single-minded; they build specific tools designed around an exact workflow. AppEngine is designed around very specific workflows (web applications and background processes). BigQuery is designed around a very specific process (an append-only analytics datastore). What if you want to queue messages? What if you want to store versioned files? What if you need to store petabytes of metadata in an SQL-like environment? How about DNS with location-based routing?

For those, you're on your own. Yes, there are open-source solutions you can run, and you can even run them on Google Cloud, but Google doesn't give you any simple way to do so.

Shiny features, not hard-level power

In conclusion, Google Cloud offers a lot of shiny features (such as live VM migration), but that is not enough. They did not focus enough on the core requirements, just the differentiating factors. Yes, there are some very nice features in Google Cloud, but it is not a complete solution. There is still too much missing to make it a real competitor to AWS.

What does Google need to do?

To get me to switch, there are a few things Google needs:
  • Easier access (Signed API Keys)
  • Granular Access Control (by API Key, by service, by access type)
  • Better documentation
  • Better API Wrappers (Node.js, Python, Go)
  • More services, or easy access to open-source alternatives such as:
    • Redis
    • Memcache
    • RabbitMQ
  • More tutorials, with different use-cases, such as:
    • Video Upload/Encoding
    • Translation
    • Image Manipulation
    • Background Processing of text files
    • Google+ Sentiment analysis
    • etc...
And don't assume everyone is using Java. Write examples in other languages, even in Go, to show off how nice a language it really can be.

My Plea

Please Google, keep on innovating, and make the developer tools better. BigQuery is an awesome tool, but accessing it is not.

I am not an AWS fanboy; it's just the only service that works. I would gladly use Google Cloud if it offered a competitive alternative. It just doesn't right now.

Wednesday, April 23, 2014

CloudSearch vs Algolia - Battle of the Search engines

It's pretty clear the new CloudSearch isn't heading in the right direction; in fact, it's starting to lose some of its important features and is becoming a commodity instead. In addition to doing a lot of digging into the new CloudSearch to see if it had any redeeming qualities, I also started looking around for alternatives. One in particular stood out as having a lot of potential: Algolia. Silly name, but amazing product.

Let's break it down.

Indexing

CloudSearch v1 allows you to send arbitrary JSON fields even if they aren't in the index. However, anything not configured won't be searchable until you do configure it. Still, every field can hold multiple values, and every field can be added to the index later if needed. CloudSearch v2 does not let you send extra fields; instead, it throws an error and refuses to index anything that contains them. That means if you want to start searching on another field, you have to re-submit all your documents after adding it to the index.

Algolia, on the other hand, accepts any arbitrary JSON, including nested data. Yes, you can send something like this to Algolia and it will just figure it out:

{
  "author": {
     "first_name": "Chris",
     "last_name": "Moyer",
     "id": 12345
  },
  "title": "Building Applications in the Cloud",
  "toc": [
     {
        "number": 1,
        "title": "..."
     }
  ]
}

The previous example would index "author.first_name", "author.last_name", "author.id", "title", "toc.number", and "toc.title". You can even go multiple levels deep and it just works.

All of this without having to pre-configure an index. Yes, you can choose which fields are indexed and how, but you don't have to. It tries to figure everything out automatically, and it does a pretty good job.
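For reference, pushing a record like the one above is a couple of lines with the Algolia Python client of that era; the application ID, API key, and index name here are placeholders:

from algoliasearch import algoliasearch

# Placeholders: use your own application ID and an API key with write access.
client = algoliasearch.Client("YourApplicationID", "YourAdminAPIKey")
index = client.init_index("books")

# Algolia indexes the nested fields (author.first_name, toc.number, ...)
# without any up-front schema configuration.
index.add_object({
    "author": {"first_name": "Chris", "last_name": "Moyer", "id": 12345},
    "title": "Building Applications in the Cloud",
    "toc": [{"number": 1, "title": "..."}],
}, "book-12345")  # optional objectID; omit it and Algolia assigns one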

Winner: Algolia


Searching

Both versions of CloudSearch allow complex boolean queries. CloudSearch v1 uses a Lisp-like boolean query format:

   (and (or author:'Chris' title:'Building Applications*') content:'Cloud Computing' (not 'Weather'))

This lets you combine some very complex logic and gives you full power to search full-text throughout your records. You can also do simple text-based searches and define which fields those text-based searches hit by default. You can use wildcards or full words, as well as phrases grouped by quotes ("this is a phrase"). With v2, this syntax changes slightly, but it still allows some very complex querying, and it even adds location-based (lat/lon) searching.
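In v2 the same kind of query goes through the structured parser; from memory it looks roughly like this (check the CloudSearch 2013-01-01 docs for the exact operators):

   (and (or author:'Chris' (prefix field=title 'Building')) content:'Cloud Computing' (not content:'Weather'))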

Algolia does not allow complex boolean queries. You can make full-text searches and filter against facets. You cannot group clauses the way you can in CloudSearch, and you cannot do negation searches. You CAN do some OR logic with facet filters, but nothing nearly as complex as CloudSearch offers.
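The closest you get in Algolia is combining a full-text query with facet filters: a nested array acts as an OR, and the outer array acts as an AND. A sketch against the same hypothetical index as above (and assuming these attributes were declared as facets in the index settings):

# OR within the inner list, AND across the outer list: match either author,
# but always restrict to the "book" type.
results = index.search("cloud computing", {
    "facetFilters": [
        ["author.last_name:Moyer", "author.last_name:Smith"],  # OR
        "type:book",                                           # AND
    ],
})
print(results["nbHits"])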

Algolia also allows you to search multiple indexes at once; with CloudSearch you do not get that option.

Both systems offer lat/lon searching, faceting, and numeric range filtering. Both also return results quickly (within a few milliseconds).

Winner: CloudSearch


Analytics

CloudSearch added support for search analytics. These come in three different reports: Search Count, Top Searches, and Top Documents.

The most interesting one is Search Count.

All reports are also downloadable as CSV files, which can be used for further analysis. Most of the data is very raw and not very useful straight out of the console.

Algolia, on the other hand, provides a weekly email with many more stats, and their console includes quite a bit of eye candy.


They also provide a nice dashboard which contains a lot of useful performance stats, as well as a general "health" indicator for your indexes.


There's also a full set of stats available for each index, including the number of operations, searches, and records, all as time series.

Winner: Algolia


Setup/Administration

CloudSearch requires quite a bit of initial setup. You have to provision your domain, initialize some indexes, and then wait about 30 minutes for each domain to be created. You also have to configure which IP addresses can access the domains. This is quite contrary to other Amazon Web Services; it does not support IAM or credentials at all.

Algolia, on the other hand, does support access tokens, and even supports setting up custom credentials with varying levels of permissions on different indexes. It does not allow you to edit the permissions after the credentials are generated, but you can always revoke credentials and send out new ones. As for setup? There is almost none. You can create a new index in seconds; you don't need to start with anything. You can even do it from the API by just sending documents, then configuring a few things like default sort order, facets, and default search fields.

Additionally, when you change an index in Algolia, it happens nearly instantaneously. With CloudSearch you have to re-issue an "Index Documents" request, which temporarily puts your domain in a partially working state (searches might return outdated results) and takes anywhere from a few minutes to a few hours. It also costs you money.

Algolia lets you clear an index instantly, and those records are gone immediately. This makes resetting an index very simple. With CloudSearch, you have to remove each document individually and then issue a new Index Documents request to get the size of your domain back down.

Winner: Algolia



Pricing

CloudSearch v1 was based entirely on the A9 search system. It was built to run on large servers and designed around the speed of search results. It works very well, but it requires a lot of resources and is therefore costly. You also can't tell ahead of time how much storage you'll need, and there's very little transparency into how much you're using. The domains scale automatically, and you don't have much control over it.

CloudSearch v2 is based on a different system and does significantly reduce costs; however, it is still expensive and doesn't really let you know how much storage you're using. You can give the domain hints about how large you want it to start out, but you don't get any control over where it goes from there.

With CloudSearch, all you can ever see is how many documents are in the domain and how many servers of what size are being used. It scales automatically with you, but the cost is very high.

With Algolia, you pick a plan. Until you decide to go with an Enterprise plan, you're paying for the number of documents in your index. About $450/month gets you 5 million documents. For me, that's about an XXLarge domain on CloudSearch, which is about $800/month on AWS, plus indexing costs. Want to make sure it's reliable? Then you have to turn on Multi-AZ, doubling the cost to $1,600/month. CloudSearch v2 has been known to reduce index sizes by up to 50%, but even then, with Multi-AZ enabled, you're looking at about $800/month. Plus you pay for batch uploads and Index Documents requests if you need to run those.

Algolia also shows you right up front how much of your quota you're using, and you can easily remove documents. When you remove a document it's gone right away; you don't have to fiddle about trying to get your domains to scale back down. If you want to go Enterprise, you pay by storage size: you can get 150GB of index storage, mirrored onto 3 dedicated servers, for about $1,750/month. In my case, that fits about 30 million records pretty easily, which currently costs us about $6k/month on CloudSearch. That's a pretty big difference.


Winner: Algolia


Conclusion

In total, that brings Algolia to 4 wins, with CloudSearch at only 1. Still, that one win is on search capability itself. Algolia was designed around making things fast while requiring very few resources. They're slick, powerful, and new. They have a long way to go, but they're already winning over CloudSearch. For most of my needs, Algolia wins easily, even without the complex querying capabilities.

If for nothing other than cost alone, Algolia is vastly better than CloudSearch. The team is small, but the product is solid, and I can't wait to see where it goes next.

Have you worked with Search as a Service solutions? What other systems have you found useful?
