BitBucket was down (status.bitbucket.org)
68 points by Achshar on April 27, 2014 | 52 comments




Looks like they're back in service - https://twitter.com/bitbucket/status/460429088717434882

Postmortem, please.....


The uptime for GitHub for 2013 was 99.69%, according to http://www.getapp.com/compare/source-code-management-softwar....

I can't find a stat for BitBucket, does anyone know?

I'm considering switching to GitHub for a private repo I'm currently hosting on BB, due to the downtime.


Bitbucket's downtime is pretty much the same as Github's.

I use it at work and my team pushes to it approximately 40-50 times a day. In the last year the total amount of downtime I've experienced has been 4-8 hours. I pointed out the lack of downtime history to one of the guys when they asked if we could move to Stash, because we had experienced 15 minutes of downtime twice in two days. (I should also point out that we've experienced 10-24 hours of downtime on our internal Jenkins so far this year.)

It's just when you do experience downtime it feels really bad.


I would also like to point out (as a heavy user of both platforms) that BitBucket has outages that do not 'register' as an outage.

There are issues like hanging on pull requests/merges at random or not loading the diff.

Frustrating.


GitHub has this issue, too. Try doing a PR on GH when the diff is reasonably large. We get 500s from them when trying to review anything worth doing. That and the inability to set permissions on branches have prevented us from moving from self-hosted Gitolite to GH full time.


You should try BitBucket, just to see how hard it bails on you.


Do they offer unlimited private repos now? That's why I chose BB over GitHub back then. I don't notice them going down often, though.


Yeah, BB does unlimited private repos... but GitHub doesn't. I would pay for GitHub, but unless it has better uptime there is no point...



You can evaluate by looking at http://status.bitbucket.org/history; it seems like there's one outrage almost once a month.

I used to use BB a lot, almost daily, last year. I would occasionally hit one of those "ssh issue" or "outage" problems. The most problematic part is that BB cannot close issues automatically via commit message on some repositories. Some. It's rather annoying.

If you're considering using BB or GitHub for a real product, I advise hosting your own server running either GitLab or SCM-Manager (supports git, mercurial, svn), and when you push, push to both your "local" server and the remote one. This way you can still do some remote work during downtime without sending patches around. This is an option if you truly need a backup plan...
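A rough sketch of that setup with git (the host names and repo paths here are just placeholders): give origin two push URLs, so a single push updates both your own server and Bitbucket:

    # push to both the self-hosted server and Bitbucket with one "git push"
    git remote set-url --add --push origin git@gitlab.internal:team/project.git
    git remote set-url --add --push origin git@bitbucket.org:team/project.git
    git push origin master
    # fetches still come from the original URL only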


I'm not sure if you're making a joke and/or English is not your first language, but, for the record, I think you're looking for the word "outage", not "outrage".


That's a typo. Thank you. And this has nothing to do with whether English is my first or second or third language. I can be completely fluent in English and still make a typo.

So next time, just saying "you meant outage" is good enough.


You misspelled the word consistently with each usage. I thought you were trying to be funny. So, it wasn't clearly a typo to two distinct readers.

PS - It looks like you edited the second outrage, but not the first.


It's a perfectly viable option to use both.

It makes sense to choose one as your primary to meet your workflows, but having both means work can continue when one has an outage.

You can even automate pushing/updating commits to whichever you choose to be the secondary.

(Or of course you can set up your own git server as a third backup in case the internet explodes.)
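If it helps, a minimal sketch of that kind of automation with git (the remote name and URL are made up): keep a second remote for the secondary host and mirror to it whenever you push to the primary, by hand, from a cron job, or from a post-push script:

    # one-time setup: the secondary hosting service as a mirror target
    git remote add secondary git@github.com:team/project.git
    # run after pushing to the primary; mirrors all refs (branches and tags)
    git push --mirror secondary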


On the status page, click on "Month", to the right of "System Metrics". It gives the uptime for the current month only, unfortunately. So for April, it was 99.816%.



While I don't have anything smart to say about stuff like this, I'd love to see their postmortem on this one. It would be somewhat hilarious if it's another bad configuration push across their infrastructure. GitHub has had so many of them, especially given the recent article about the stock brokerage firm that went bankrupt due to a bad configuration/code push.


A firm going bankrupt due to a bad code/config push? Do you have a link to the article you're referring to? Sounds interesting...



Just the fact that they didn't have a way to record and verify whether the deployment was done properly boggles my mind. When I worked at a bank we had package management to do deployments, a separate tool for taking inventory of installed software (in case users managed to sneak third-party programs onto their systems), and on top of that a web framework for tracking milestones during projects. It allowed for manual entry by technicians and automated input from scripts, so tasks that had to be done by hand, like replacing hardware, could be coordinated with build scripts, and management could monitor the whole thing from a dashboard.


Wow! Bookmarking that one. What a great cautionary tale for both developers and devops. I may well need to use that as a teaching aid. Though it's a security principle, I cannot tell you how many times I have to point out the need for defense in depth in the design of software.


A reference to Knight Capital Group, most likely.


Maybe. They have been working on migrating some of the repositories over to new hardware lately.


I doubt they are pushing new configuration at 6am on a Sunday morning.


I doubted that GitHub would push new (bad) production code mid-day unannounced, and it still happened. To be fair, I think GitHub pushes new production code several times a day, every day?


If you are pushing new configuration at 6am on a Sunday, and things immediately stop working, you revert the configuration change.


If you work in an environment where such rollbacks are that simple, you're in the rare minority. The reality of working with large-scale distributed systems is that rolling back becomes much more complicated. Push out new code and the accompanying DB schema change? Good luck rolling back to the older schema when you find the bug.


I don't think it's that simple, especially if the new bad configuration has already run amok with the machine/data. Backups are a thing, but at that scale they're probably not completely current every minute.


I do not know about that. In a lot of environments deployments on the weekend are common, even in full devops/automatic deploy situations.


So how much does this affect the workflow for Git? It doesn't seem like that much of a big deal for a small amount of downtime unless I'm hosting a public page or open source project. I, personally, use Bitbucket for the unlimited private repos; it makes development easy for me, so downtime doesn't affect my team that much since we can still pull from each other.


How do you pull from your team members when the hosting site is down?


If you have ssh access to their machines, you can just set up their repo as a new remote.

If not, "git bundle" supports coordination via email, which is the workflow that git was originally designed to support.


You can at the very least send each other a zip file containing the .git directory. When you extract that directory into a separate directory, you can add it as a remote and fetch/pull from it (because that's the entire repository right there).

What I usually do is I copy the .git to another name before running zip, so it doesn't extract as .git :)
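Roughly like this, assuming the copy is called project-backup.git (the names are only illustrative):

    # sender: copy the repository database and zip it up
    cp -r project/.git project-backup.git
    zip -r project-backup.zip project-backup.git
    # receiver: unzip it next to their checkout and use it as a remote
    unzip project-backup.zip
    git remote add colleague ../project-backup.git
    git fetch colleague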

This is also one of the reasons you shouldn't commit binaries, especially large ones.


For Mercurial, you'll need to run "hg serve" to have a lightweight Mercurial server up and running, so other developers can pull from you just like from any other repository:

    hg pull -r deadbeef http://developer-01:8000/
Remember, we're in DVCS land here.


One of the nice things about a DVCS is that you can work with as many copies as you need - there's no downside to pushing a copy out to a spare server where everyone can access it via SSH until the main repo is back online.
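Something like this, assuming a host called spare.example.com that everyone can reach over SSH (just a sketch):

    # create an empty bare repository on the spare server
    ssh user@spare.example.com 'git init --bare /srv/git/project.git'
    # publish your local copy there and use it as a temporary remote
    git remote add spare user@spare.example.com:/srv/git/project.git
    git push spare --all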


Using "git remote set-url origin <link>", I set a new remote, which would be a server we host that is also running git, or one of our personal computers acts as the remote until Bitbucket is back up. The thing is, the service hasn't been down for long periods of time, so the normal mode of operation is to push locally until we need to push to the remote, and by then Bitbucket is usually up.


in my personal experience bitbucket is much reliable than github, I don't know its uptime though..


...much [more/less] reliable...

you forgot a very important word


What other factor are you considering for reliability if you don't know uptime?


It's back online now.


Why are they not hosting it on the status.bitbucket.org server? It has never been down as far as I know!


status.bitbucket.org is hosted on Statuspage.io.

It would be rather silly to host your status page on your main production infrastructure since if it goes down so does your status page.

Personally, unless you're a major player, I think you should always outsource your status page.


Surely it's not about outsourcing a status page, but just making sure that it is on completely different infrastructure?


Outsourcing allows you to isolate the status page from your human infrastructure. If your team makes a bad decision that leads to an outage then they can make the same bad decision about your status page if they are in charge of that as well. This provides a form of human fault tolerance.


I agree that insulating against human fault is always a good thing to do but at some point you must trust your team to do the right thing. In fact in using a 3rd party you are trusting that said 3rd party also has a form of human fault tolerance.

That said I cannot imagine even the most incompetent team making a change to the status page, which I would always host on completely separate infrastructure, at the same time as the rest of production infrastructure.


Then why don't they host their main servers on statuspage.io? I've never seen a status page that is down.


Yet another reason to self-host GitLab[1] or HgLab[2].

[1]: https://www.gitlab.com/ [2]: http://hglabhq.com/


Then, when it goes down you have to deal with it instead of someone else!

There's usually a curve where it makes sense to invest in self-hosting, but the Bitbucket slate has been really great.


I just sent a push to BitBucket, and it seemed to be taking a while, so I figured I'd just grab a quick headline here, and the top story is BitBucket being down. So my timing is good?


Is it just slow for you? What connectivity are you using? For me it's not working at all. SSH says no suitable response from server. I had to merge a branch :(


They had warned that there were updates happening last week; I think this is the after-effect of that.



