
How to handle your startup data like a big tech

by Dmitrii Nabirukhin, July 28th, 2023

Too Long; Didn't Read

Learn about the core data management principles that all big tech companies adhere to and that startups can and should adopt.


Quality and usability of a data stack might not be what a startup founder had in mind at the beginning of their business journey. But let’s be clear: when you’re overwhelmed by data, when teams lack the tools to extract the necessary data to make informed decisions, and investors can’t get a clear picture of your business health, it’s already too late.


The common misconception is that data is complex and abstract, and only corporations need to deal with it. Yet, every tech startup creates a massive amount of data with each click, transaction, and interaction. This data ranges from payment transactions, client databases, and marketing pipelines to employee birthdays. As companies grow and scale, they produce even more data that requires new tools and professionals to manage it. Interestingly, big tech companies work with data warehouses, while startups may only use Excel sheets shared on WhatsApp; still, at their core, both corporations and startups handle data in similar ways. The question then is: how do startups start doing the same?


From managing a data warehouse at a tech corporation with over ten thousand employees and working on my own startups, I've learned that regardless of the volume of data and business requirements, startups need to think big from the start. There is no universal solution for every business case, but the core principles that all big tech companies adhere to can and should be adopted by startups. Doing so will cultivate a scalable data culture at every stage of business growth.

What data do we have and who is responsible for it?

Supporting a culture of data responsibility from the beginning is crucial. Even though data is often interdepartmental and responsibilities tend to fluctuate in startups, it’s essential to understand where data is stored and who owns it. The owner of these datasets should be responsible for their storage, usage, and quality. For instance, in a situation where a product manager calculated the unit economics, but a finance team doesn’t have access to the expenses and a sales team insists that more sales were made, the role of the data owner is to create a clear path to resolve the problem.


A common misconception is that tech teams should be entirely responsible for data. This creates situations where data is led by technology requirements rather than business needs. When tech teams prioritise technical aspects like data storage, security, and infrastructure, they might not fully grasp the true needs and objectives of the business. As a result, data analysis and interpretation may become limited to what the technology allows, rather than what the business truly requires. This is not the path to becoming a data-driven company. It might sound counterintuitive, as all the data lives inside the data stack, but business teams should be the data owners and stewards. At the end of the day, the point of data management is to serve business needs.


The easiest way to start is to divide your data between business functions, such as sales (CRM, marketing data), product (unit economics, backlogs, road maps), and finance (financial reports, budgeting). Then, assign a data owner from each business function for each dataset. To prevent these data responsibility roles from getting lost during rapid growth and scaling, make sure to reassign them whenever teams or responsibilities change.
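As a rough illustration, the split by business function can start as a small inventory that records each dataset, its function, its owner, and where it lives. The sketch below is only one possible shape for such an inventory; the dataset names, owner roles, and locations are hypothetical examples, not a prescribed structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    business_function: str  # e.g. "sales", "product", "finance"
    owner: str              # hypothetical owner role; empty string means unassigned
    location: str           # where the data is stored


# Hypothetical inventory: every name, owner, and location here is an example.
REGISTRY = [
    Dataset("crm_contacts", "sales", "head_of_sales", "CRM"),
    Dataset("unit_economics", "product", "product_manager", "shared drive"),
    Dataset("financial_reports", "finance", "finance_lead", "shared drive"),
    Dataset("marketing_leads", "sales", "", "spreadsheet"),
]


def datasets_without_owner(registry):
    """Return the names of datasets that have no owner assigned."""
    return [d.name for d in registry if not d.owner.strip()]


def owner_of(registry, dataset_name):
    """Return the owner of a dataset, or None if it is unknown or unassigned."""
    for d in registry:
        if d.name == dataset_name:
            return d.owner or None
    return None


print(datasets_without_owner(REGISTRY))      # ['marketing_leads']
print(owner_of(REGISTRY, "unit_economics"))  # product_manager
```

Running the first check after every reorganisation is a cheap way to notice datasets whose ownership got lost along the way.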

How do we store the data?

The main rule here is: if you don’t keep all the data in corporate software, you don’t have the data at all. As a simple example, imagine a client who is ready to buy a product with a 30% discount that was agreed with your sales manager in a private message. If that sales manager is now on vacation and there’s no record of the discount in the CRM system, you will face some issues. Ensure that all important data is stored and accessible to the right team members.


First, set up shared file storage to support collaboration among team members. Cloud storage might be your best option, as it's an accessible, scalable, and usually cost-effective solution for managing your data. Even when you introduce new software and tools, such as document management and content management systems, CRM, and ERP, cloud storage will remain a place where team members can store data like reports, presentations, or documentation.


Keep both the raw data (data that has not been processed for use) and the processed data. This helps in situations where something went wrong and a team needs to redo the analytics or sync data between departments.


Remember, data can be high-quality but still not usable. Corporations typically adopt a ‘store it all’ approach to data storage, but for startups with limited resources, this can be daunting. It is hard to predict what data might be useful in the future, but thoughtful planning and a look at market best practices might provide some insights. If your startup is struggling with storage costs, consider introducing a recurring data demand review. Over time, it might turn out that some data that was considered useful no longer is. You can start by checking data that hasn’t been used for many months, but remember to align any decision to delete or archive data with the respective owners. If that doesn’t help, it might be a good time to look for a tool with better data compression and more suitable pricing options.
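A recurring data demand review can be sketched in a few lines. The example below assumes you can export a list of datasets with a last-used date and an owner; the record format, the dataset names, and the 180-day threshold are all assumptions made for illustration. Note that it deletes nothing: it only builds a review list per owner, in line with the advice to align every decision with the data owners.

```python
from collections import defaultdict
from datetime import date, timedelta


def stale_datasets(records, today, max_idle_days=180):
    """Group datasets that were unused for longer than max_idle_days by owner.

    records: iterable of (dataset_name, last_used_date, owner) tuples.
    Returns a dict mapping each owner to the datasets they should review.
    """
    cutoff = today - timedelta(days=max_idle_days)
    by_owner = defaultdict(list)
    for name, last_used, owner in records:
        if last_used < cutoff:
            by_owner[owner].append(name)
    return dict(by_owner)


# Hypothetical usage log; names, dates, and owners are examples only.
usage_log = [
    ("old_campaign_export", date(2022, 11, 1), "sales"),
    ("monthly_pnl", date(2023, 7, 1), "finance"),
    ("beta_survey_raw", date(2023, 1, 10), "product"),
]

for owner, names in stale_datasets(usage_log, today=date(2023, 7, 28)).items():
    print(f"{owner}: please review {', '.join(names)}")
```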

How do we secure the data?

Data security is undoubtedly one of the most important goals for any tech startup, as well as the hardest one to achieve. Just imagine that a finance team member stores reports on their personal computer, and a sales intern has access to the full client database from day one. It sounds like a recipe for disaster. While full data security is mostly an unrealistic objective, building a company culture with data security at its core can help minimise the risk of major data leaks or damage.


Once you have chosen your data storage software, and data owners and stewards have been assigned, it’s now time to discuss who sees what and how to get access to data. Choose a team member responsible for assigning data access. For small startups, it is probably going to be a founder, but as the company scales, a decentralised approach may be more suitable. It’s easier to maintain access permissions when they are granted to roles rather than team members, which makes it easier to transfer access between employees in case of any changes.
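To make the role-based idea concrete, here is a minimal sketch, assuming permissions are attached to roles and people are only ever mapped to roles. The role names, user names, and datasets are hypothetical, and a real setup would rely on the access controls built into your storage or CRM software rather than on a script like this.

```python
# Hypothetical roles and the datasets each role may read.
ROLE_PERMISSIONS = {
    "finance_analyst": {"financial_reports", "budget"},
    "sales_manager": {"crm_contacts", "sales_pipeline"},
    "sales_intern": {"sales_pipeline"},
}

# Hypothetical team members mapped to roles, never directly to datasets.
USER_ROLES = {
    "alice": {"finance_analyst"},
    "bob": {"sales_manager"},
    "carol": {"sales_intern"},
}


def can_access(user, dataset, user_roles, role_permissions):
    """Return True if any of the user's roles grants access to the dataset."""
    return any(
        dataset in role_permissions.get(role, set())
        for role in user_roles.get(user, set())
    )


def transfer_roles(from_user, to_user, user_roles):
    """Move all roles from one person to another, e.g. after a handover."""
    roles = user_roles.pop(from_user, set())
    user_roles.setdefault(to_user, set()).update(roles)


print(can_access("carol", "crm_contacts", USER_ROLES, ROLE_PERMISSIONS))  # False
print(can_access("bob", "crm_contacts", USER_ROLES, ROLE_PERMISSIONS))    # True
```

Because permissions hang off roles, handing work over to a new employee is a single `transfer_roles` call instead of a dataset-by-dataset audit.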


Moreover, the data owners should know the platforms' backup policies by heart. For the most sensitive data, a manual backup must be applied. The restoration of your company's customer database after a website breach could depend entirely on a single checkmark in the site settings.

How do we keep the data organised?

It’s likely that, at some point, data owners will be busy with more pressing matters. Their focus will move away from the data stack, leaving it without any support. That’s why at the very beginning, it’s essential to adopt the right approach to data management. Otherwise, while scaling and growing, a startup could be overwhelmed by data, and produce duplicate reports and analytics based on incorrect data. The more unmanaged data you have, the less usable it becomes.


Prioritising is key here. At every stage of a company’s development, choose the right approach to data governance that is aligned with your current needs and capabilities. Data governance is a collection of policies, processes, roles, metrics, and standards for effective and efficient use of information.


Consider introducing a data steward role in your company. At the very beginning, it’s probably going to be a founder or CEO, who will set up data governance and oversee the start of its cross-functional implementation. Shortly after, however, it’s better to appoint a dedicated team member to facilitate its development, for example by providing data owners with guidance on data storage, security, and management, and by encouraging the exchange of best practices. Make sure that data owners follow the policies. If they don’t, investigate the cause and make the required changes.

How do we know where to find the data and what does it actually mean?

As the volume of your data increases, so does the time spent searching for the right data. For example, when a commercial team needs to report on a company’s monthly results for the last two years, looking through all the data can be a stressful task if it wasn’t kept in order from the beginning. That’s where a knowledge base will help. Even though documentation might seem overwhelming and unnecessary when everyone is juggling numerous urgent tasks, once done, it will be a significant time-saver in the future.


Corporations usually use data catalogue software, a kind of data library where data is well-organised and securely stored. For startups, such a solution will probably be excessive; any knowledge management software, such as Confluence or Notion, will be more than enough.


Start by writing down what software you use, what data you have, who is responsible for it, how to get access, etc. Next, create a data analytics glossary that will help the team speak the same language and work on cross-functional tasks more effectively. Make sure the same terminology is also used in file names and folder paths; these small measures will significantly reduce the time spent on searching for and exchanging files, as well as on the documentation itself.
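One way to keep file names aligned with the glossary is a small check that can be run over a shared folder listing. The naming pattern below, `<function>_<term>_<YYYY-MM>.<ext>`, is purely an assumed convention for illustration, and so are the glossary terms and function names; adapt them to whatever your team agrees on.

```python
import re

# Hypothetical glossary terms and business functions.
GLOSSARY = {"ltv", "cac", "mrr", "churn"}
FUNCTIONS = {"sales", "product", "finance"}

# Assumed convention: <function>_<term>_<YYYY-MM>.<ext>, all lowercase.
NAME_PATTERN = re.compile(
    r"^(?P<function>[a-z]+)_(?P<term>[a-z]+)_(?P<period>\d{4}-\d{2})\.(?P<ext>[a-z0-9]+)$"
)


def check_filename(filename):
    """Return a list of problems with the file name; an empty list means it is fine."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        return ["name does not follow <function>_<term>_<YYYY-MM>.<ext>"]
    problems = []
    if match["function"] not in FUNCTIONS:
        problems.append(f"unknown business function: {match['function']}")
    if match["term"] not in GLOSSARY:
        problems.append(f"unknown glossary term: {match['term']}")
    return problems


for name in ["finance_mrr_2023-07.xlsx", "Final report v2.xlsx", "sales_revenue_2023-07.csv"]:
    print(name, "->", check_filename(name) or "ok")
```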


Assess the complexity of documentation based on the business structure, size, and amount of data possessed. For instance, a glossary listing the data collected and the most crucial reports with their data owners would suffice for small low-tech businesses. For high-tech startups, data flow, entity relationship diagrams, deployment diagrams, and metadata layers can be beneficial for ensuring data quality and usability.


A good practice is to create manuals for reports and analytics. These could be step-by-step guides with examples or video recordings showing how to create them.


Pay particular attention to the alignment between departments in terms of reports. If members of the operations team keep asking the finance team for the same numbers in different formats and variations, it can create a mess inside both departments and waste time. This issue can easily be avoided if departments agree on what type and format of data should be provided, the timeframes, and which team members are responsible for delivery.

How do we handle the data?

By fostering a data-driven culture within a startup, you create an understanding for every team member that data analytics is a task that should be taken seriously. Data governance must become an essential element of business, from choosing a data analytics tool to establishing file naming principles.


Let’s consider some more useful tips and tricks for how to handle the data better:


  1. Documentation. If there’s a pattern of repeatedly emerging questions or situations, such as team members wondering where reports live or how to calculate a newly introduced metric — document it. It’s better to address such questions right away than find out later that your company uses at least two different formulas to calculate the LTV of its customer.
  2. Reports prototyping. When creating an extended report, start with its prototype: discuss the essential requirements, do a quick analysis, approve a layout and design, and then move to the report creation. This process will help to avoid situations where a report that took a month to create turns out to be unacceptable.
  3. Briefing. Always brief before data processing. When creating a report using data you’re not an owner of, make sure you interpret it correctly.
  4. Peer review. Cultivate a culture of peer reviews to validate data and its usability. Encourage asking questions before doing anything and align priorities with the overall business goals rather than individual or department goals.
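The LTV example from the documentation tip is easy to reproduce. The sketch below compares two widely used simplified LTV formulas, one based on revenue and one that also applies gross margin; both the formulas and the input numbers are illustrative assumptions, not definitions taken from any particular company.

```python
def ltv_revenue_based(arpu, monthly_churn):
    """Average monthly revenue per customer times expected lifetime (1 / churn)."""
    if monthly_churn <= 0:
        raise ValueError("monthly_churn must be positive")
    return arpu / monthly_churn


def ltv_margin_based(arpu, monthly_churn, gross_margin):
    """Same as above, but counting only the gross margin earned per customer."""
    if monthly_churn <= 0:
        raise ValueError("monthly_churn must be positive")
    return arpu * gross_margin / monthly_churn


# Hypothetical inputs: 50 per month per customer, 5% monthly churn, 70% margin.
arpu, churn, margin = 50, 0.05, 0.7

print(round(ltv_revenue_based(arpu, churn), 2))
print(round(ltv_margin_based(arpu, churn, margin), 2))
```

With these inputs, the two definitions give roughly 1000 and 700 for the same customers, which is exactly the kind of gap a documented, shared formula prevents.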


For any tech startup, the data stack is an essential part of the business. Some big tech practices might look unnecessary and even harmful (for example, if growth speed is more important than the long-term perspective). But fostering a data-driven culture from the very beginning can pave the way to success and help turn your startup into a fast-scaling company. Just remember: it’s a lifelong journey, and following the Pareto principle will help you on your way to becoming a truly data-driven company.