Our take on building your comms system internally

If you're a PM or engineer who has ever wanted to build your own notification system, or you're in the process of building one, or even if you’ve finished building one, this article is for you. Building a notification system may seem simple when you're dealing with just a single email or SMS, but as the number of channels and providers increases and the need to send transactional communications at scale arises, challenges start to emerge.

In this blog, we'll explore the intricacies of building a comms engine in-house, no matter how big or small your business is.

Difficulties businesses face while building an internal comms system

What started as a simple API call to one vendor now has to scale into a system with cross-channel and omnichannel routing, failover protocols, an understanding of user preferences, and so much more. Below, we dig deeper into what actually goes on in orgs that build in-house.
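To make the routing and failover requirement a little more concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `Router` class, the provider names, and the `(recipient, message) -> bool` send signature are illustrative assumptions, not any vendor's real API. It tries channels in the user's preferred order and, within a channel, falls back to the next provider when one fails.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical provider interface: (recipient, message) -> True if accepted.
SendFn = Callable[[str, str], bool]


@dataclass
class Router:
    # channel -> ordered providers; the first entry is the primary.
    providers: Dict[str, List[Tuple[str, SendFn]]] = field(default_factory=dict)

    def send(self, recipient: str, message: str,
             preferred_channels: List[str]) -> Optional[str]:
        """Return "channel:provider" for the first successful send, else None."""
        for channel in preferred_channels:  # respect the user's preference order
            for name, send_fn in self.providers.get(channel, []):
                try:
                    if send_fn(recipient, message):
                        return f"{channel}:{name}"
                except Exception:
                    # Treat any provider error as a failure and fail over.
                    continue
        return None


# Stub providers standing in for real vendor SDK calls (assumed, not real APIs).
def flaky_sms(recipient: str, message: str) -> bool:
    raise TimeoutError("primary SMS vendor timed out")


def backup_sms(recipient: str, message: str) -> bool:
    return True


router = Router(providers={
    "sms": [("primary", flaky_sms), ("backup", backup_sms)],
    "email": [("primary", lambda r, m: True)],
})
print(router.send("user-1", "Your order has shipped", ["sms", "email"]))
```

Even this toy version leaves out retries, rate limits, delivery receipts, and per-vendor error handling, which is where most of the real engineering effort tends to go.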

Hit us up if you relate to 1 or more of these below!

  1. Communication data in silos: PMs often face the challenge of manually downloading reports from various vendors and consolidating the data, which is a highly time-consuming process. For instance, vendors like Amazon SES only provide generalized analyses without user-level data, which may not meet the in-depth requirements of businesses. Every vendor also reports in a different format, which means that the PM, analyst, or engineer involved has to manually consolidate reports across channels, making it difficult to obtain consolidated reports on a regular basis, such as monthly or weekly summaries.
  2. Time-consuming onboarding and integration process: Even if it's a one-time effort, integrating each vendor into the system requires valuable time and effort from the tech team. This involves reviewing additional documentation, implementing new SDKs, and in case of vendor failures, restarting the onboarding process from scratch.
  3. Undue pressure on engineering teams: As product teams expand and businesses grow, the responsibility of managing notifications primarily falls on the tech team, because that’s who handles an internal comms system. This places undue pressure on engineers, who now have to liaise with multiple product teams for diverse notification needs, diverting their attention from their core tasks.
  4. Increased costs and the need for extra engineering talent: Building scalable notification infrastructure takes expertise, and you need to hire specialists to build and manage it. LinkedIn, for example, hired almost 60-70 new members to manage its in-house comms system. Hiring additional engineering talent becomes a necessity to ensure reliable notification delivery, especially for critical transactional notifications that cannot afford failures. However, this leads to a significant increase in costs, as the system requires ongoing maintenance and support.
  5. Tedious and lengthy process for minor changes: Even making minor adjustments, such as changing a greeting from "hello" to "hi," can be a time-consuming and arduous process within the notification system, and these delays add up to a significant loss of efficiency in the long run.
  6. Lack of organization and dynamic content management: Content management for notifications often becomes disorganized and scattered across different platforms, such as spreadsheets and email threads. This lack of organization creates chaos, particularly because notifications are not prioritized, and it makes handovers difficult during employee churn or replacements. It also concentrates dependency and risk on one employee, adding to the pressure.
  7. Breaking at scale: As the business scales and the volume of notifications increases, it becomes crucial to have a system that can handle the growing demands without compromising performance or user experience. Scaling a notification system requires careful planning and infrastructure to support the expanding user base.
  8. Lack of version control: Moving changes between test and live environments becomes tougher, and every such change increases the dependency on engineering.
  9. Reliability suffers: Every PM/engineer strives for 100% deliverability of their notifications. With transactional notifications, it’s crucial that all messages get delivered, as some of them are extremely critical and need to be delivered in the nick of time. The reliability of your notification system matters because when communication breaks down, you risk losing valuable customers that you have invested time and money to acquire. A reliable system ensures that notifications are delivered consistently and on time, minimizing the chances of customer churn.
  10. Go-to-market delays: Inefficient notification systems can lead to delays in launching new channels or expanding into new markets. When notifications are not streamlined, your team's focus is diverted from core product development to managing notification-related issues, causing delays in implementing new channels and strategies. This, in turn, also affects the core product you’re building, as your resources are reallocated to other tasks.
  11. Troubleshooting complexity: Managing multiple platforms for notifications can make troubleshooting and issue resolution a time-consuming and challenging process. With numerous platforms to handle, tracing the root cause of an issue can take hours or even longer, leading to productivity loss and frustration for your team.
  12. Compliance and privacy: It’s crucial to prioritize compliance with relevant regulations and standards covering data protection and privacy (e.g., SOC 2, HIPAA). Ensuring that PII is handled securely and obtaining proper consent for communication is essential for maintaining trust and avoiding legal complications. This also requires some experience and assistance to make it a smooth process to follow. A business will need to deploy resources to get this done properly and avoid future complications. It requires extensive planning around data encryption, access controls (“who” will have “how” much access to “which” data), authorization protocols, and more!
  13. Difficulty in automation and building a workflow builder: When building internal tools, you often don’t think of building them as self-serve (DIY) tools for other teams. Automation and workflow management can be very time-consuming for businesses to implement internally. Creating logical workflows, managing templates for different channels, automating manual tasks like setting up failovers, or creating smart routing flows takes hours of coding effort. This includes setting up triggers, scheduling, and rules for automated notifications so that manual effort goes down and delivery stays timely and consistent. Doing this for every channel isn’t feasible for a tech team whose main job isn’t notifications. When you also have to build complex business logic into your infrastructure, it adds yet another layer of complexity. Internally built tools often lack the flexibility to be reused or operated by other teams on their own.
  14. Multiple sprint planning requirements: To successfully implement a notification system that works, your team will need to allocate multiple sprints in order to plan for it in advance and also be ready for any contingencies.
  15. Time-consuming approval cycles: Monitoring every channel and also managing third-party providers takes up a lot of time and resources. Apart from that, waiting for approvals from top management, like CEOs, marketing heads etc, on the notification language could add to the ticking clock.
  16. Collaboration with every vendor individually: A third-party service may report a message as delivered even though your end user never received it. It then falls on the PM or engineer to track down the issue with the individual vendor and troubleshoot it. This not only leaves you with a dissatisfied customer but also with the frustrating process of going through all the logs manually to sort out the issue.
  17. What the biggies did: Huge businesses like LinkedIn, Uber, etc., have built notification systems at scale. Here’s what they put into it. Unless you have resources as large as theirs, your system could very easily break, and your business could suffer as a result.
  • A really large team of 60-70 engineers who worked exclusively on the comms system
  • A unified dashboard that the entire team can access at any time
  • Observability tools that can detect errors
  • The scale to handle millions of notifs
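The siloed-data problem in point 1 can be sketched in a few lines of Python. The two vendor report formats below are made up for illustration (field names like `event` and `status_code` are assumptions, not any real vendor's schema); the idea is simply that each vendor needs its own normalizer before anything can be reported in one place.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Two made-up vendor report formats (assumptions for illustration only).
email_vendor_rows = [
    {"recipient": "a@example.com", "event": "Delivery"},
    {"recipient": "b@example.com", "event": "Bounce"},
]
sms_vendor_rows = [
    {"to": "+15550100", "status_code": 0},  # 0 = delivered in this fake format
    {"to": "+15550101", "status_code": 0},
]

# Common record shape used after normalization: (channel, user, status).
Record = Tuple[str, str, str]


def normalize_email(row: Dict) -> Record:
    status = "delivered" if row["event"] == "Delivery" else "failed"
    return ("email", row["recipient"], status)


def normalize_sms(row: Dict) -> Record:
    status = "delivered" if row["status_code"] == 0 else "failed"
    return ("sms", row["to"], status)


def consolidate(records: Iterable[Record]) -> Counter:
    """Count records per (channel, status) across all vendors."""
    return Counter((channel, status) for channel, _user, status in records)


records = [normalize_email(r) for r in email_vendor_rows]
records += [normalize_sms(r) for r in sms_vendor_rows]
summary = consolidate(records)
for (channel, status), count in sorted(summary.items()):
    print(channel, status, count)
```

Every new vendor means another normalizer to write and keep in sync whenever that vendor changes its export format, which is the maintenance cost the list above keeps coming back to.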

It’s not an easy feat to build a communications system internally. As we’ve laid out above, there are a lot of hurdles to cross, and it doesn’t stop there: the system also needs constant upkeep and maintenance.

In the debate of Build vs Buy, we vote for the latter.