Migrating App Engine Standard to Cloud SQL v2

I recently discovered that Google Cloud SQL v2 now supports App Engine standard runtime.  This is very exciting for me. I wanted to try out the process and make sure there were no gotchas.

GAE+SQL

  • I created a new Cloud SQL v2 instance.
  • I used my syncing script from my blog post Migrating between Cloud SQL databases to move the data to the new instance.
  • I created a new App Engine module by store a new version of my app using the old code base.
  • I changed the connection string from the old database to the new one. The pattern to make this happen has changed a bit, more down below.
  • That was it.  The new code served up just fine. I served up on the old module until I made the connection string config tweak to the old code base.

New connection string

The new connection strings are only a little different than the old ones, and should require just a change to one string in your config.

The directions will tell you to look for your Instance connection name in the Instance properties of your Cloud SQL Developer Console. There are two patterns that these strings come in. 

  • V1 connection names follow the pattern projectid:instancename
  • V2 connection names follow the pattern projectid:regionname:instancename.

It’s a pretty simple change, but I can see someone accidentally (or willfully) not reading the documentation and getting tripped up on this. The new connection strings require region name; that’s all there is to it.  I’ve tested this on PHP. I assume it works everywhere. But your mileage may vary.  Golang tests are coming soon; I will update when I make that change.

Cloud SQL v1 vs v2

Adri Van Houdt asked me on Twitter: @tpryan @googlecloud what are the major differences [between Cloud SQL v1 and v2]?

It demanded more than a 140-character answer.

Most of the differences are explained in the Cloud SQL Documentation, but to sum up salient details: Cloud SQL v2 has better performance, more capacity, and will probably run cheaper for the same amount of work.

  • Up to 7X throughput and 20X storage capacity of first generation instances
  • Less expensive than first generation for most use cases
  • Option to add high availability failover and read replication
  • Configurable backup period and maintenance window
  • Proxy support (for better security and access using dynamic IP addresses)

Wait what? Bigger, Faster, Cheaper? Is there a reason why I wouldn’t run on Cloud SQL v2?  It’s in beta, so there are some features missing.  Most are not deal breakers, unless, you know, they are for you:

  • No Service Level Agreement (SLA)
  • Only MySQL 5.6 is supported.
  • No point-in-time recovery from backups
  • No standard Persistent Disk (HDD) support
    • But yes Solid-state Persistent Disk (SSD)
  • No automatic scaling of storage capacity
    • But you can do manually increases with no downtime
  • No IPv6 connectivity

The big one here for most larger customers is going to be no SLA.

Not listed here, there is another important feature. If you are an App Engine customer and looked at v2 before, it did not support App Engine. However, that appears to have changed: you can now access Cloud SQL v2 from App Engine standard environment or App Engine flexible environment.

The pricing for v1 and the pricing for v2 are a bit different. The pricing for v2 looks a lot more like the pricing tier structure for Compute Engine now.  

Also, not sure if it is related but we have a new logo now.  Instead of the on-the-nose “SQL” on a blue hexagon, it’s a hard-angled symbol for database on a blue hexagon.  

SQLLogo

So there you have it, bigger, better, cheaper, but no SLA. There are more subtle differences that are explained at length in the documentation, so you should definitely test out and see for yourself. I, for one, am moving my stuff over now that there is App Engine standard support.

Migrating between Cloud SQL databases

I run a bunch of SQL databases on Cloud SQL v1, and I wanted to move them over to Cloud SQL v2.   I like to automate this sort of thing.  I also like to have the new database essentially mirror the old one, until I’m ready to cut over.

I looked into writing a script that could do that with gcloud.  Turns out,  it is incredibly simple. The sql tools in gcloud can import and export directly to Cloud Storage.

PROJECT=[Project ID]
SRC_INSTANCE=[Name of source Cloud SQL instance to target]
DES_INSTANCE=[Name of destination Cloud SQL instance to target]
BUCKET=gs://[Name of Bucket set aside for temporarily storing mysqldump]
DATABASE=[MySQL Database name]

# Export source SQL to SQL file
gcloud sql instances export $SRC_INSTANCE $BUCKET/sql/export.sql 
--database $DATABASE --project $PROJECT

# Import SQL file to destination SQL
gcloud sql instances import $DES_INSTANCE $BUCKET/sql/export.sql 
--project $PROJECT 

# Delete SQL file and export logs. 
gsutil rm $BUCKET/sql/export.sql*

There you go — three lines of commands.  The only thing you need to do to make the new DB work is make sure all of the database accounts are set up correctly on the new server, otherwise application calls will bomb.

Now keep in mind that your mileage (or kilometerage) may vary.  In this case, I am going between MySQL 5.5 and MySQL 5.6, and I had no issues. If there is a reason that your old DB won’t run in the new target, it will fail.  This script also assumes that you are in the same project with appropriate permissions to all.

There’s a lot more you can do with gcloud to manage your Cloud SQL installation. Make sure to check out the rest of the documentation.

 

When to Pick Google Bigtable vs Other Cloud Platform Databases

dbsRecently, Google Cloud Platform announced the availability of an additional database option for our customers: Google Cloud Bigtable.  Now Cloud Platform Developers have another Google managed solution for storing their data. So, when do you use which tool?

The contenders:

Cloud SQL

Are you using some sort of relational database with tables, views, and  indices? Does your application rely on stored procedures and custom table views or write joins? Do you need the certainty of transactions and ACID compliance?  Do you generally both read and write to the data, not in equal amounts, but not lopsidedly one or the other?  If you answered “yes” to a lot of these you probably want to investigate or stick with Cloud SQL, which as its name suggests, is an implementation of MySQL hosted by Google.

Cloud SQL does have all of the limitations of MySQL. Certain types of applications don’t require the complexity of normalized data with no duplication. Other cases require scaling in a manner that SQL cannot handle without additional complexity, reduced performance, or higher cost. Going into the full pros and cons of SQL vs NoSQL is beyond the scope of this post, but rest assured there are reasons that both of them exist, and both are valid choices depending on circumstances.

NoSQL: Datastore or Bigtable

If you are leaning towards a NoSQL solution you now have two Google managed NoSQL choices on our platform. What is the differentiator between Cloud Datastore vs Cloud Bigtable? They are both NoSQL solutions. Both are described as massively scalable.  Both leave you with little to no management (or No Ops for those of you playing buzzword bingo).  The answer lies in four areas:

  • Size
  • Structure
  • Analysis
  • Interface

Size

Bigtable is optimized for mind boggling huge sets of data. Seriously, it is most cost effective when dealing with datasets that start at 1 Terabyte. Datastore can handle large data sets too, but Datastore is performance and cost optimized to handle smaller sets of data too.  Have a few GBs of data? Datastore would be the better call. Have data that might start out small, but grow to a Terabyte in time, still Datastore. Have data that starts at a Terabyte and will keep expanding? Then you’ve started down a path that might make Bigtable interesting. But size alone isn’t the only factor.

Structure

Bigtable stores data in a big honkin’ table. Yeah, the name is a little on the nose, but it’s true. There are rows and columns somewhat like relational database systems but not exactly. But it has a schema, and predefined structure.

Cloud Datastore, on the other hand, is more optimal for ad hoc storage of structured data representing objects.  Basically you define an object and then push it into Datastore. You don’t define a schema, create tables, or set up any other sort of structure before storing a record.

Analysis

Do you need to analyze the data in massive aggregate scale while the database is still online and taking requests? Do you want to run MapReduce on your production data without copying it somewhere for study? Do you want to hook it up to various Big Data analysis toolkits?  If this sounds like what you want to do, Bigtable makes more sense.

Interface

If you are coming to Google Cloud Platform from other technologies, and are working with HBase, Bigtable is for you.  Bigtable is accessible through extensions to the HBase 1.0 API and is therefore compatible with a lot of the Hadoop ecosystem as well as other Big Data tools.

On the other hand,  there are also a few limitations. You cannot join. There is no SQL interface. The API gives you Put/Get/Delete individual records, or you can run Scan operations.

Datastore does not have SQL either, but has an API called GQL that while not exactly the same does abstract querying objects in a way that most SQL developers should be able to quickly understand.

Conclusion

Finally the product page has a great explanation of Bigtable’s relation to other Google Cloud Platform offerings:

Cloud Bigtable and other storage options

Cloud Bigtable is not a relational database; it does not support SQL queries or joins, nor does it support multi-row transactions. Also, it is not a good solution for small amounts of data (< 1 TB).

  • If you need full SQL support for an online transaction processing (OLTP) system, consider Google Cloud SQL.

  • If you need interactive querying in an online analytical processing (OLAP) system, consider Google BigQuery.

  • If you need to store immutable blobs larger than 10 MB, such as large images or movies, consider Google Cloud Storage.

  • If you need to store highly structured objects, or if you require support for ACID transactions and SQL-like queries, consider Cloud Datastore.

In short, there is a lot of awesome stuff about Cloud Bigtable, but it doesn’t mean that it is right in all cases.  It’s a NoOps, NoSQL, Big Data analysis tool, meant to be used at massive scale in conjunction with other Big Data tools.  I recommend that you check out the documentation for Bigtable as there is much more to be found there. And let me know if you need more clarification on anything.