and has easy to use multi-regional support at a fraction of the cost of what it would take on AWS. I directly point my NAS box at home to GCS instead of S3 (sadly having to modify the little PHP client code to point it to storage.googleapis.com), and it works like a charm. Resumable uploads work differently between us, but honestly since we let you do up to 5TB per object, I haven't needed to bother yet.
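For anyone wondering what "pointing an S3 client at GCS" looks like in practice, here's a minimal sketch in Python/boto3 (rather than the PHP client mentioned above). It assumes you've generated HMAC interoperability keys in the Cloud Storage settings; the key values, bucket name, and file are placeholders.

    import boto3

    # Any S3-speaking client can talk to GCS's XML API by overriding the endpoint.
    # The HMAC access id/secret come from Cloud Storage's "Interoperability" settings.
    gcs = boto3.client(
        "s3",
        endpoint_url="https://storage.googleapis.com",
        aws_access_key_id="GOOG...HMAC_ACCESS_ID",   # placeholder
        aws_secret_access_key="HMAC_SECRET",         # placeholder
    )
    gcs.upload_file("backup.tar.gz", "example-nas-backups", "backup.tar.gz")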
Again, Disclosure: I work on Google Cloud (and we've had our own outages!).
By boulos 8 years ago
Apologies if this is too much off-topic, but I want to share an anecdote of some serious problems we had with GCS and why I'd be careful to trust them with critical services:
Our production Cloud SQL started throwing errors that we could not write anything to the database. We have Gold support, so we quickly created a ticket. While there was a quick reply, it took a total of 21+ hours of downtime to get the issue fixed. During the downtime, there is nothing you can do to speed this up - you're waiting helplessly. Because Cloud SQL is a hosted service, you cannot connect to a shell or access any filesystem data directly - there is nothing you can do, other than wait for the Google engineers to resolve the problem.
When the Cloud SQL instance was up and running again, support confirmed that there is nothing you can do to prevent a filesystem crash, it "just happens". The workaround they offered is to have a failover set up, so it can take over in case of downtime. The worst part is that GCS refused to offer credit, as according to their SLA this is not considered downtime. The SLA [1] states: "with respect to Google Cloud SQL Second Generation: all connection requests to a Multi-zone Instance fail" - so as long as the SQL instance accepts incoming connections, there is no downtime. Your data can get lost, your database can be unusable, your whole system might be down: according to Google, this is not downtime.
TL;DR: make sure to check the SLA before moving critical stuff to GCS.
I've used both Google Cloud and AWS, and as of a year or so ago, I'm a Google Cloud convert. (Before that, you guys didn't at all have your shit together when it came to customer support)
It's not in bad taste, despite other comments saying otherwise. We need to recognize that competition is good, and Amazon isn't the answer to everything.
By JPKab 8 years ago
The brilliance of open sourcing Borg (aka Kubernetes) is evident in times like these. We[0] are seeing more and more SaaS companies abstract away their dependencies on AWS or any particular cloud provider with Kubernetes.
Managing stateful services is still difficult but we are starting to see paths forward [1] and the community's velocity is remarkable.
K8s seems to be the wolf in sheep's clothing that will break AWS' virtual monopoly on IaaS.
[0] We (gravitational.com) help companies go "multi-region" or on-prem using Kubernetes as a portable run-time.
I have a component in my business that writes about 9 million objects a month to Amazon S3. But, to leverage efficiencies in dropping storage costs for those objects I created an identical archiving architecture on Google Cloud.
It took me about 15 minutes to spin up the instances on Google Cloud that archive these objects and upload them to Google Storage. While we didn't have access to any of our existing uploaded objects on S3 during the outage, I was able to mitigate the inability to store any new objects going forward. (Our workload is heavily geared towards being very write-heavy for these objects.)
It turns out this cost-leveraging architecture works quite well as a disaster recovery architecture.
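To make the disaster-recovery angle concrete, here is a minimal sketch (not this commenter's actual pipeline) of dual-writing each archived object to S3 and Cloud Storage so either store can serve as the fallback; the bucket names and key scheme are made up for illustration.

    import boto3
    from google.cloud import storage

    s3 = boto3.client("s3")
    gcs = storage.Client()

    def archive(key, payload):
        # Write the same object to both stores so a single-provider outage
        # only affects reads of old objects, not ongoing ingestion.
        s3.put_object(Bucket="example-archive", Key=key, Body=payload)
        gcs.bucket("example-archive-dr").blob(key).upload_from_string(payload)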
By blantonl 8 years ago
Opportunistic, sure. But I did not know about the API interoperability. Given the prices, makes sense to store stuff in both places in case one goes down.
By sachinag 8 years ago
Not poor taste at all. Love GCP. I actually host two corporate static sites using Google Cloud Storage and it is fantastic. I just wish there were a bucket-wide setting to adjust the Cache-Control header. Currently it defaults to 1 hour, and if you want to change it, you have to use the API/CLI and provide a custom cache-control value with each upload. I'd love to see a default cache-control setting in the web UI that applies to the entire bucket.
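For what it's worth, here is a sketch of that per-object workaround using the google-cloud-storage Python client (the bucket name is a placeholder); the point stands that a bucket-wide default would be nicer.

    from google.cloud import storage

    client = storage.Client()
    bucket = client.bucket("example-static-site")
    for blob in bucket.list_blobs():
        blob.cache_control = "public, max-age=86400"
        blob.patch()  # pushes the metadata change without re-uploading the object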
I also want to personally thank Solomon (@boulos) for hooking me up with a Google Cloud NEXT conference pass. He is awesome!
By nodesocket 8 years ago
Hopefully you're still there even though S3 is back up. I have an interesting question I really, really hope you can answer. (Potential customer(s) here!!)
There are a large number of people out there looking intently at ACD's "unlimited for $60/yr" and wondering what that really means.
I recently found https://redd.it/5s7q04 which links to https://i.imgur.com/kiI4kmp.png (small screenshot) showing a user hit 1PB (!!) on ACD (1 month ago). If I understand correctly, the (throwaway) data in question was slowly being uploaded as a capacity test. This has surprised a lot of people, and I've been seriously considering ACD as a result.
On the way to finding the above thread I also just discovered https://redd.it/5vdvnp, which details how Amazon doesn't publish transfer thresholds, their "please stop doing what you're doing" support emails are frighteningly vague, and how a user became unable to download their uploaded data because they didn't know what speed/time ratios to use. This sort of thing has happened heaps of times.
I also know a small group of Internet archivists that feed data to Archive.org. If I understand correctly, they snap up disk deals wherever they can find them, besides using LTO4 tapes, the disks attached to VPS instances, and a few ACD and GDrive accounts for interstitial storage and crawl processing, which everyone is afraid to push too hard so they don't break. One person mentioned that someone they knew hit a brick wall after exactly 100TB uploaded - ACD simply would not let this person upload any more. (I wonder if their upload speed made them hit this limit.) The archive group also let me know that ACD was better at storing lots of data, while GDrive was better at smaller amounts of data being shared a lot.
So, I'm curious. Bandwidth and storage are certainly finite resources, I'll readily acknowledge that. GDrive is obviously going to have data-vs-time transfer thresholds and upper storage limits. However, GSuite's $10/month "unlimited storage" is a very interesting alternative to ACD (even at twice the cost) if some awareness of the transfer thresholds was available. I'm very curious what insight you can provide here!
The ability to create share links for any file is also pretty cool.
By i336_ 8 years ago
Now that's what I call a shameless plug!
By ptrptr 8 years ago
We would seriously consider switching more to GCS if your cloud functions were as powerful as AWS Lambda (e.g. triggering from an S3 event) and supported Python 3.6 with serious control over the environment.
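For context, the kind of handler being referred to is roughly the following: a Python Lambda that fires on an S3 object-created event and pulls the bucket/key out of the event payload (the processing step here is just a placeholder).

    def handler(event, context):
        # S3 delivers one or more records per invocation.
        for record in event.get("Records", []):
            bucket = record["s3"]["bucket"]["name"]
            key = record["s3"]["object"]["key"]
            print(f"object created: s3://{bucket}/{key}")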
By scrollaway 8 years ago
I keep telling people that in my view, Google Cloud is far superior to AWS from a technical standpoint. Most people don't believe me... Yet. I guess it will change soon.
By simonebrunozzi 8 years ago
I'm in the process of moving to GCS mostly based on how byzantine the AWS setup is. All kinds of crazy unintuitive configurations and permissions. In short, AWS makes me feel stupid.
By joshontheweb 8 years ago
As far as I understand, the S3-compatible API of Cloud Storage is meant as a temporary solution until a proper migration to Google's own APIs.
The S3 keys it produces are tied to your developer account. This means that if someone gets the keys from your NAS, they will have access to all the Cloud Storage buckets you have access to (e.g. your employer's).
I use Google Cloud but not Amazon. Once I wanted an S3 bucket to try with NextCloud (then OwnCloud). I was really frightened to generate an S3 key with my Google developer account.
By andmarios 8 years ago
"fraction of the cost" - how do you figure? Or are you just saying from a cost-to-store perspective?
Your egress prices are quite a bit higher than CloudFront's for sub-10TB ($0.12/GB vs $0.085/GB).
The track record of S3 outages vs. the time you're up and serving egress makes it seem like S3 wins on cost. If all you're worried about is cross-region data storage, you're probably a big player and have an AWS enterprise agreement in place that offsets the cost of storage.
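As a rough worked example of that egress gap (assuming a flat 10 TB/month and ignoring tiering, request charges, and storage itself):

    # Hypothetical 10 TB/month of egress at the quoted per-GB rates.
    gb = 10 * 1024
    gcs_egress = gb * 0.12       # ~$1,228.80/month
    cloudfront = gb * 0.085      # ~$870.40/month
    print(gcs_egress - cloudfront)  # ~$358.40/month difference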
By rynop 8 years ago
So this is more compute related, but do you know if there are any plans to support the equivalent of the webpagetest.org (WPT) private instance AMI on your platform?
Not only is webpagetest.org a Google product, but it's also much better suited to the minute-by-minute billing cycle of Google Cloud compute. For any team not needing to run hundreds of tests an hour, the cost difference between running a WPT private instance on EC2 versus on Google Cloud compute could easily be in the thousands of dollars.
By Spunkie 8 years ago
Would use Google but I just can't give up access to China. Sad because I also sympathize with Google's position on China.
By malloryerik 8 years ago
@boulos, not in bad taste at all - happy Google convert and GCS user here. Works very well for us, YMMV.
By zoloateff 8 years ago
If you made a .NET library that allows easily connecting to both AWS and GCS by only changing the endpoint, I would certainly use that library instead of Amazon's own.
Just saying, it gets you a foot in the door.
By DenisM 8 years ago
I had no idea this was an option. Great to know!
By danielvf 8 years ago
I have had problems integrating Apache Spark with Google Storage, especially because S3 is directly supported in Spark.
If you are API-compatible with S3, could you make it easy/possible to work with Google Storage inside Spark?
Remember, I may or may not run my Spark on Dataproc.
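One workaround that may or may not fit this setup: since Spark's s3a connector just needs an endpoint and keys, you can try pointing it at Cloud Storage's XML API with HMAC interoperability keys. A hedged sketch follows; the property names are the standard hadoop-aws ones, the key values and bucket are placeholders, and on Dataproc the preinstalled gs:// connector is usually the simpler path.

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .config("spark.hadoop.fs.s3a.endpoint", "storage.googleapis.com")
        .config("spark.hadoop.fs.s3a.access.key", "GOOG...HMAC_ACCESS_ID")  # placeholder
        .config("spark.hadoop.fs.s3a.secret.key", "HMAC_SECRET")            # placeholder
        .config("spark.hadoop.fs.s3a.path.style.access", "true")
        .getOrCreate()
    )
    df = spark.read.text("s3a://example-bucket/logs/")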
By sandGorgon 8 years ago
What is your NAS box doing with S3/GCS?
By mbrumlow 8 years ago
S3 applications can use any object store if they use S3Proxy:
How about giving a timeline of when Australia will be launching? I see you're hiring staff, and have a "sometime 2017" goal on the site, but how about a date estimate? :)
By thejosh 8 years ago
Does GCS support events yet?
By philliphaydon 8 years ago
As Relay's chief competitor in this region, we of Windsong have benefited modestly from the overflow; however, until now we thought it inappropriate to propose a coordinated response to the problem.
By hyperpallium 8 years ago
What software are you using for your NAS box?
By espeed 8 years ago
Classy parley. I'll allow it.
By pmarreck 8 years ago
Competition is great for consumers!
By masterleep 8 years ago
S3 is currently (22:00 UTC) back up.
The timeline, as observed by Tarsnap:
First InternalError response from S3: 17:37:29
Last successful request: 17:37:32
S3 switches from 100% InternalError responses to 503 responses: 17:37:56
S3 switches from 503 responses back to InternalError responses: 20:34:36
First successful request: 20:35:50
Most GET requests succeeding: ~21:03
Most PUT requests succeeding: ~21:52
By cperciva 8 years ago
Thanks for taking the time to post a timeline from the perspective of an S3 customer. It will be interesting to see how this lines up against other customer timelines, or the AWS RFO.
By josephb 8 years ago
Playing the role of the front-ender who pretends to be full-stack if the money is right, can someone explain the switch from internal error to 503 and back? Is that just them pulling s3 down while they investigate?
By kaishiro 8 years ago
No. SoundCloud uses AWS S3. It is still down. This is false information.
By thenewregiment2 8 years ago
A piece of hard-earned advice: us-east-1 is the worst place to set up AWS services. You're signing up for the oldest hardware and the most frequent outages.
For legacy customers, it's hard to move regions, but in general, if you have the chance to choose a region other than us-east-1, do that. I had the chance to transition to us-west-2 about 18 months ago and in that time, there have been at least three us-east-1 outages that haven't affected me, counting today's S3 outage.
EDIT: ha, joke's on me. I'm starting to see S3 failures as they affect our CDN. Lovely :/
By gamache 8 years ago
Reminds me of an old joke: Why do we host on AWS? Because if it goes down then our customers are so busy worried about themselves being down that they don't even notice that we're down!
By traskjd 8 years ago
I'm getting the same outage in us-west-2 right now.
By xbryanx 8 years ago
My advice is: don't keep all your eggs in one basket. AZs give you localised redundancy, but as cloud capacity is cheap and plentiful, you should be using at least two regions to house your solution (if it's important to you).
EDIT: less arrogant. I need a coffee.
By movedx 8 years ago
It shouldn't be technically possible to lose S3 in every region; how did Amazon screw this up so badly?
By bischofs 8 years ago
Amen. We set up our company cloud 2 years ago in us-west-2 and have never looked back. No outage to date.
By twistedpair 8 years ago
Is us-east-2 (Ohio) any better (minus this aws-wide S3 issue)?
By compuguy 8 years ago
Probably valid, though in this case while us-west-1 is still serving my static websites, I can't push at all.
By jchmbrln 8 years ago
The s3 outage covered all regions.
By nola-radar 8 years ago
That's a really good point!
By notheguyouthink 8 years ago
I used to track DynamoDB issues and holy crap, AWS East had a 1-2 hour outage at least every 2 weeks. Never in any of the other regions. AWS East is THA WURST
By shirleman 8 years ago
The s3 outage covered all regions.
By nola-radar 8 years ago
Yup, same here. It has been a few minutes already. Wanna bet the green checkmark[1] will stay green until the incident is resolved?
In December 2015, I received an e-mail from AWS with the following subject line, around 4 am:
"Amazon EC2 Instance scheduled for retirement"
When I checked the logs it was clear the hardware had failed 30 minutes before they scheduled it for retirement. The EC2 instance and root device data were gone. The e-mail also said "you may have already lost data".
So I know that Amazon schedules servers for retirement after they have already failed; the green check doesn't surprise me.
By emrekzd 8 years ago
It's crazy how much better the communication (including updates and status pages) of the companies that rely on AWS is than AWS's own communication.
These service health boards are more like an advertisement page than the actual status of the service.
By tlogan 8 years ago
I'm seeing green checkmarks across the board, but they just added a notice to the top of the page:
> Increased Error Rates
> We are investigating increased error rates for Amazon S3 requests in the US-EAST-1 Region.
By hartleybrody 8 years ago
Well, at least our decision to split services has paid off. All of our web app infrastructure is on AWS, which is currently down, but our status page [0] is on Digital Ocean, so at least our customers can go see that we are down!
EDIT UPDATE: Well, I spoke too soon - even our status page is down now, but not sure if that is linked to the AWS issues, or simply the HN "hug of death" from this post! :)
EDIT UPDATE 2: Aaaaand, back up again. I think it just got a little hammered from HN traffic.
By cyberferret 8 years ago
FYI to S3 customers, per the SLA, most of us are eligible for a 10% credit for this billing period. But the burden is on the customer to provide incident logs and file a support ticket requesting said credit (it must be really challenging to programmatically identify outage coverage across customers /s)
> The dashboard not changing color is related to S3 issue.
> See the banner at the top of the dashboard for updates.
So it's not just a joke... S3 being down actually breaks its own status page!
By geerlingguy 8 years ago
Thank god I checked HN. I was driving myself crazy for the last half hour debugging a change to S3 uploads that I JUST pushed to production. Reminds me of the time my dad had an electrician come to work on something minor in his house. Suddenly power went out to the whole house, and the electrician couldn't figure out why for hours. Finally they realized this was the big east coast blackout!
By jliptzin 8 years ago
Corporate language is entertaining while we all pull out our hair.
"We are investigating increased error rates for Amazon S3" translates to "We are trying to figure out why our mission critical system for half the internet is completely down for most (including some of our biggest) customers."
I've been fuzzing S3 parameters for the last couple hours...
And now it's down.
By maxerickson 8 years ago
All: I hate to ask this, but HN's poor little single-core server process is getting hammered and steam is coming out its ears. If you don't plan to post anything, would you mind logging out? Then we can serve you from cache. Cached pages are updated frequently so you won't miss anything. And please do log back in later.
(Yes it sucks and yes we're working on fixing it. We hate slow software too!)
By dang 8 years ago
"I felt a great disturbance in the Force, as if millions of voices suddenly cried out in terror, and were suddenly silenced. I fear something terrible has happened."
By greenhathacker 8 years ago
Down for us as well. We have CloudFront in front of some of our S3 buckets and it is responding with:
CloudFront is currently experiencing problems with requesting objects from Amazon S3.
Can I also say I am constantly disappointed by AWS's status page: https://status.aws.amazon.com/. It seems that whenever there is an issue, it takes a while to update. Sometimes all you see is a green checkmark with a tiny icon noting some issue. Why not make it orange or something? Surely they must have some kind of external monitor on these things that could be integrated here?
edit: Since posting my comment they added a banner of
"Increased Error Rates
We are investigating increased error rates for Amazon S3 requests in the US-EAST-1 Region."
However S3 still shows green and "Service is operating normally"
By chrisan 8 years ago
Sysadmin: I can forgive outages, but falsely reporting 'up' when you're obviously down is a heinous transgression.
Somewhere a sysadmin is having to explain to a mildly technical manager that AWS services are down and affecting business critical services. That manager will be chewing out the tech because the status site shows everything is green. Dishonest metrics are worse than bad metrics for this exact reason.
Any sysadmin who wasn't born yesterday knows that service metrics are gamed relentlessly by providers. Bluntly, there aren't many of us, and we talk. Message to all providers: sysadmins losing confidence in your outage reporting has a larger impact than you think. Because we will be the ones called on the carpet to explain why <services> are down when <provider> is lying about being up.
By johngalt 8 years ago
They don't show it on the status dashboard at https://status.aws.amazon.com/ (at least at the time I originally posted this comment).
Edit 2: And now the event disappeared from my personal health dashboard too. But we are still experiencing issues. WTH.
By jrs235 8 years ago
It's interesting to note the cascading effects. For example, I was immediately hit by four problems:
* Slack file sharing no longer works, hangs forever (no way to hide the permanently rolling progress bar except quitting)
* Github.com file uploads (e.g. dropping files into a Github issue) don't work.
* Imgur.com is completely down.
* Docker Hub seems to be unavailable. Can't pull/push images.
By atombender 8 years ago
what's truly incredible is that S3 has been offline for h̶a̶l̶f̶ ̶a̶n̶ ̶h̶o̶u̶r̶ two hours now and Amazon still has the audacity to put five shiny green checkmarks next to S3 on their service page.
they just now put up a box at the top saying "We are investigating increased error rates for Amazon S3 requests in the US-EAST-1 Region."
increased error rates? really?
Amazon, everything is on fire. you are not fooling anyone
It's not just us-east-1! They're being extremely dishonest with the green checkmarks. We can't even load the s3 console for other regions. I would post a screenshot, but Imgur is hosed by this too.
By STRML 8 years ago
It's unreal watching key web services fall like dominoes. It's too bad the concept of "too big to fail" applies only to large banks and countries.
By rrggrr 8 years ago
Thanks for sharing. I overheard someone on my team say that a production user is having problems with our service. The team checked AWS status, but only took notice of the green checkmarks.
Through some dumb luck (and desire to procrastinate a bit), I opened HN and, subsequently, the AWS status page and actually read the US-EAST-1 notification.
HN saves the day.
By mabramo 8 years ago
Wow, S3 is a much bigger single point of failure than I had imagined. Travis CI, Trello, Docker Hub, ...
I can't even install packages because the binary cache of NixOS is down. Love living in the cloud.
By rnhmjoj 8 years ago
Notice how Amazon.com itself is unaffected. They're a lot smarter than us.
By benwilber0 8 years ago
And they've just broken four-9's uptime (53 minutes). They must be pretty busy, since they still haven't bothered to acknowledge a problem publicly...
By bandrami 8 years ago
Best thing about incidents like these: post-mortems for systems of this scale are absolutely fascinating. Hopefully they publish one.
By obeattie 8 years ago
This seems like as appropriate a time as any... Anyone want to list some competitors to S3? Bonus if it also provides a way to host a static website.
Apple's iCloud is having issues too, probably stemming from AWS. Ironically Apple's status page has been updated to reflect the issue while Amazon's page still shows all green. https://www.apple.com/support/systemstatus/
By valine 8 years ago
Wow this is a fun one. I almost pooped my pants when I saw all of our elastic beanstalk architecture disappear. It's so relieving to see it's not our fault and the internet feels our pain. We're in this together boys!
I'm curious how much $ this will cost the economy today. :)