We started Shopflo in December 2021, and there were several important decisions we had to make at the beginning to set the base right. In hindsight, those decisions seem obvious and straightforward, but back then we spent hours and days researching, conversing, and comparing to reach these conclusions. This post should help anyone who is looking to start a tech company. As tech leaders, we have to make decisions that allow us to build fast and scale seamlessly.
Choosing the stack:
There is no one-size-fits-all answer to this. Choosing the right stack for your team is like choosing the right car in a racing game: you choose according to your expertise and skill set.
We started with Python Django but eventually decided to keep every stack in which we have enough expertise to maintain best practices. Currently, we are working with Python, Node.js, and Java Spring Boot. This lets us pick whichever works best for a particular use case. An added advantage is that it makes hiring easier, as it widens the top of the funnel for candidates.
One thing to avoid while choosing a stack is anything that is not mature enough or is losing popularity in open source circles.
We use ReactJS for our frontend, which is very popular and mature.
Monolith vs Microservices:
For a startup launching an MVP as quickly as possible, a monolith seems to be the straightforward choice. But we had concerns around a few modules like payments, login, and catalog, as they could become bottlenecks at scale. To address this, we have structured the modules inside our monolith so that they can be broken out into separate services fairly quickly.
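One way to picture this kind of structure is a monolith whose modules talk to each other only through narrow interfaces. The sketch below is illustrative and not our actual code: the `PaymentsService` interface, the `InProcessPayments` and `Checkout` classes, and all method names are hypothetical. The idea is that checkout never reaches into payments internals, so the in-process implementation could later be swapped for a client that calls a separate payments service.

```python
from dataclasses import dataclass
from typing import Protocol


class PaymentsService(Protocol):
    """Boundary of the payments module (hypothetical interface)."""

    def charge(self, order_id: str, amount_paise: int) -> str:
        """Charge an order and return a payment reference."""
        ...


class InProcessPayments:
    """Today: payments live inside the monolith and are called directly."""

    def charge(self, order_id: str, amount_paise: int) -> str:
        if amount_paise <= 0:
            raise ValueError("amount must be positive")
        # Placeholder logic; a real module would talk to a payment provider.
        return f"pay_{order_id}"


@dataclass
class Checkout:
    """Checkout depends only on the interface, never on payments internals."""

    payments: PaymentsService

    def place_order(self, order_id: str, amount_paise: int) -> dict:
        reference = self.payments.charge(order_id, amount_paise)
        return {"order_id": order_id, "payment_reference": reference}


# Wiring happens in one place, so swapping the implementation is a local change.
checkout = Checkout(payments=InProcessPayments())
receipt = checkout.place_order("ord_1", 49900)
```

If payments ever moves out of the monolith, only the wiring line changes: a class that makes a network call and satisfies the same `PaymentsService` interface would replace `InProcessPayments`, and `Checkout` stays untouched.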
Cloud:
Using AWS for our cloud service needs was a no-brainer. Thanks to Tiger Global we got 100k credits from AWS Activate.
AWS:
- Elastic Container Service: With a small team and limited bandwidth, we realized we would need a lot of upskilling or dedicated DevOps to use Kubernetes (EKS). While we were initially unsure about ECS, we learned through our network that a few companies were using it effectively at very high scale. That gave us the confidence to move ahead with ECS. With two days of upskilling, we were able to deploy ECS services without any dedicated DevOps. The services have autoscaling set up, so they scale with load.
- RDS: I don’t think you should think twice before choosing RDS :)
- API Gateway: All our services are located in a private subnet and can only be accessed via API Gateway. For authorization, we use Lambda authorizers, which saves us from repeating authorization logic in every service. This allows us to sleep peacefully at night knowing that only the intended resources are exposed to the public.
- ElastiCache: We use it with Redis, the de facto caching solution.
- DynamoDB: Along with everything else it does well, one useful feature of DynamoDB is its TTL, which lets you run delayed tasks that are not timing-critical. Its upsides and downsides are a story for another time.
- Others (Lambda, SNS): Do not be afraid of using cloud-specific managed services. In our case we use ECS, Lambda, SNS, DynamoDB, and a few more. This saves us a lot of engineering bandwidth.
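To make the API Gateway item above more concrete, here is a minimal sketch of what a Lambda authorizer can look like. It assumes a REST API with a TOKEN-type authorizer, where API Gateway passes `authorizationToken` and `methodArn` in the event and expects an IAM policy back. The `KNOWN_TOKENS` lookup and the helper names are hypothetical placeholders, not our production logic; a real authorizer would verify a signed token with the identity provider.

```python
from typing import Optional

# Hypothetical token store for illustration only; a real authorizer would
# verify a signed token (for example a JWT) instead of a dictionary lookup.
KNOWN_TOKENS = {"demo-token": "user-123"}


def resolve_user(token: str) -> Optional[str]:
    """Return the user id for a token, or None if the token is unknown."""
    bearer_prefix = "Bearer "
    if token.startswith(bearer_prefix):
        token = token[len(bearer_prefix):]
    return KNOWN_TOKENS.get(token)


def build_policy(principal_id: str, effect: str, resource: str) -> dict:
    """Build the IAM policy document that API Gateway expects back."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": resource,
                }
            ],
        },
    }


def handler(event: dict, context: object) -> dict:
    """Lambda entry point: allow the call only for a recognised token."""
    user_id = resolve_user(event.get("authorizationToken", ""))
    method_arn = event["methodArn"]
    if user_id is None:
        return build_policy("anonymous", "Deny", method_arn)
    return build_policy(user_id, "Allow", method_arn)
```

Because the allow or deny decision is made before a request reaches any service, the services behind the gateway can assume the caller has already been authorized.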
Google Cloud/Firebase:
Google Cloud has been helpful for specific things and smaller projects. Through this, we learned a few things about it:
- Hosting: We hosted our UI on Firebase as it was very easy to set up and use. But we eventually realized that Firebase Hosting load times are slow, specifically in the India region. Because of this, we'll be moving to Cloudflare Pages soon.
- Delayed queues: One of the best things about Google Cloud is Cloud Tasks, which provides delayed queue functionality out of the box and works like a charm if you need an exact delay in your task executions. We did not find anything similar in AWS out of the box.
- Authentication: After evaluating AWS Cognito, Auth0, and Firebase, Firebase Authentication turned out to be the simplest solution for our authentication needs. It is very simple to get started with and works well, but we faced a couple of issues as we scaled. More on this later.
We are exploring Ory to replace Firebase Authentication.
Sidenote: Different AWS account for each environment.
We learned the hard way that it is essential to use a different AWS account for each environment. We were initially working on a single account for all our environments, with each environment in a different VPC, and we were pretty pleased with our decision in terms of isolation and security.
During one round of testing, we load tested a new service on the staging environment to check how well it scaled. It was going fine until we realized our production environment had gone down, because staging had used up the entire quota of a particular AWS service.
It's extremely important that every environment (or at least production) has its own account. AWS applies service quotas at the account level, so if one environment exhausts a quota, the other environments in that account can go down.
(The overhead of separating environments into different accounts later is significant, so it is very important to do this from day 0.)
Google Cloud applies quotas primarily per project, so this is not a problem there.
Tools you cannot ignore:
- Google Workspace: While it seemed to be the default option for organization email and was easy to set up, we recently ran into issues while setting up SSO: Google SAML doesn't work well with AWS SSO. One good alternative to Google Workspace is Microsoft 365, which is cheaper as well. But moving to Microsoft now would be a pain.
- GitHub: We use GitHub for version control and we cannot recommend it enough. All our deployment pipelines run on GitHub Actions. In case you're hesitant about GitHub because of cost, Bitbucket is a cost-efficient alternative. We didn't have to worry about costs as we got credits, thanks to Microsoft for Startups. Also, migrating from GitHub to Bitbucket is smooth enough in case we decide to switch later.
- Cloudflare: Cloudflare is one of the tools most recommended by DevOps professionals because of its DNS hosting and WAF capabilities. We have started doubling down on Cloudflare after seeing big UI performance improvements with Cloudflare Pages. Cloudflare ZTNA is also a great and cost-effective alternative to OpenVPN, and we'll be moving to it soon.
- Jira: For our project management needs, we initially started with ClickUp and avoided Jira solely because the product looked boring. After wasting more than a month and a few hundred dollars trying to figure out ClickUp, we finally caved to Jira and it's been a happy shift. It's boring but it's straightforward and has all the integrations we need. Efficiency > Boredom, always.
- Slack: This is a necessity. Seamless conversations, keeping in touch with all departments, multiple channels, and having the space to “slack” whenever needed.