Building a global data and technology platform to enable transformational value-creation across real estate

Martin Samworth, CEO, RE5Q and Seth Rogers, CTO, RE5Q in conversation

Martin Samworth, the former Chairman of CBRE’s advisory business in APAC and EMEA, was recently appointed as Chief Executive of RE5Q, the real estate AI and technology business, forming part of the Reech Corporations Group.  

Martin knows the global property business inside out. In over 35 years at CBRE, Martin worked internationally across all real estate sectors at the highest levels with developers, investors and occupiers. He has also long been a significant advocate for change, evolution and digital enablement across the industry and has championed many proptech and other innovation programmes in-house and working with outside partners to try to move forward what can often still feel like a largely undigitised sector. He will take all those insights, experiences and relationships into leading RE5Q into the next phase of its growth.  

In taking on the CEO role, Martin was particularly excited about the potential he sees for RE5Q to be a genuine long-term enabler of transformational change across the real estate sector. He also feels the strength of the data and tech capability which RE5Q has built and continues to build out remains a bit of an untold story. He is keen to change that. 

Martin sat down with Seth Rogers to shine some light on RE5Q’s strength and potential. Seth is RE5Q’s CTO and an internationally recognised expert in applied AI and Big Data, whose CV includes being a member of the team that brought the world Google Maps and a senior design lead at Facebook. What follows is our transcript of their wide-ranging discussion across current state innovation, the success factors which lie behind being truly transformative, how that gets applied in real-world use cases, and a crystal-ball-view on where we might be in 3-5 years’ time.  

Martin Samworth [MS]: Good morning Seth. We’ve known each other for years, but it’s good to be sitting down with you now as RE5Q’s new CEO. And it’s also great to have the opportunity to ask you a few “under-the-bonnet” questions around the tech-enabled change that I have seen gathering pace in real estate over the past few years. I see RE5Q as one of the real leaders here.

Seth Rogers [SR]: Thanks Martin and it’s great to have you on board. 

Undigitised current state, ESG and the opportunity for transformational change 

MS: So kicking off, one of the things I think is probably under-estimated, even by people within the real estate business, is the scale and the pace of that change I just mentioned. Can you help us get our arms around what is really happening out there in terms of current state innovation and the art of the possible?

SR: Sure. So, I think we probably need to begin with a candid assessment of where real estate is, currently. 

Clearly there are some pockets where things are different, but I think if you look across the sector, it’s still largely characterised by manual processes and old tech infrastructures and tools. A lot of spreadsheets, for example! 

Most buildings have a few sensors, like smoke alarms and thermostats, but these tend to be hardwired into dedicated systems, so their data is siloed and relatively static. Around 70-90% of the world’s data is locked away in legacy paper and digital formats such as PDFs, documents and old databases, and this very much characterises real estate, which has vast stores of unstructured data that are too costly to process manually. Compared to other environments, like tech, real estate looks and feels pretty opaque and not at all open-source. And what that tends to mean is that you’ve got people making judgements about the built environment on pretty imperfect information, which creates layers of informational inefficiencies and asymmetries. That can and will change.

And I guess, importantly and pressingly, real estate’s environmental credentials are weak. Buildings use more than 40% of the world’s energy and generate more than 36% of global CO2 emissions. But there is a massive transition in progress here. Around 95% of all new generating capacity commissioned in 2020 came from renewables. Unlike fossil fuel generation, we can’t directly control renewable sources like the wind or the sun, so renewable energy is inherently more dynamic, both in terms of energy generation and pricing. Developing and installing smarter, more dynamic AI-driven energy management systems is going to be essential to manage this. That’s a big lift. On the opportunity side, those dynamics are also going to create opportunities for large building owners and managers to generate revenues from curtailment/demand response. Increasingly, and the RoI on this is rapidly stacking up, we are likely to see virtual power plants and storage capacity built into the bigger new developments, turning overhead into incremental revenue generation.

MS: We’ll come back to those environmental issues, which are so dominant in the conversations I have with peers right now. But sticking with the data point, do you think that is changing though, Seth? For example, I’ve started to see more governments move towards Open Data.

More data than ever on the built environment 

SR: That is absolutely happening Martin. And there is political and regulatory momentum behind that, with governments beginning to embrace open data, given the significant, beneficial economic activity that empowers. So, in the UK for example, you’re seeing the likes of Tim Berners-Lee and the Open Data Institute, which he co-founded, spearheading moves to improve transparency, access and reusability, and the Ordnance Survey and Geospatial Commission collaborating in areas such as identification numbers for buildings.

And, as a result, there are literally petabytes of publicly available data on the built environment being released by authorities across the world every day. And just to scale that, one petabyte of 3MB standard print photos placed side by side would be over 48,000 miles long. Almost long enough to wrap around the equator, twice. 
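As a rough check on that comparison, the arithmetic can be sketched in a few lines of Python. The interview does not give the photo width or say whether decimal or binary units are meant, so both are assumptions here: decimal units are used, and the script solves for the photo width that the 48,000-mile figure implies rather than asserting one.

```python
# Back-of-envelope check of the "petabyte of photos" analogy.
# Assumptions (not stated in the interview): decimal units, i.e.
# 1 PB = 10**15 bytes and 1 MB = 10**6 bytes.
PETABYTE_BYTES = 10**15
PHOTO_BYTES = 3 * 10**6          # one 3 MB photo
KM_PER_MILE = 1.609344
CM_PER_KM = 100_000


def photos_per_petabyte() -> int:
    """Number of whole 3 MB photos that fit in one petabyte."""
    return PETABYTE_BYTES // PHOTO_BYTES


def implied_photo_width_cm(total_miles: float) -> float:
    """Photo width (cm) needed for the row of photos to span total_miles."""
    total_cm = total_miles * KM_PER_MILE * CM_PER_KM
    return total_cm / photos_per_petabyte()


def line_length_miles(photo_width_cm: float) -> float:
    """Length (miles) of the row if every photo is photo_width_cm wide."""
    total_cm = photos_per_petabyte() * photo_width_cm
    return total_cm / CM_PER_KM / KM_PER_MILE


print(f"photos per petabyte: {photos_per_petabyte():,}")
print(f"width implied by 48,000 miles: {implied_photo_width_cm(48_000):.1f} cm")
```

Under these assumptions a petabyte holds roughly 333 million such photos, and the quoted length corresponds to prints about 23 cm (around 9 inches) wide.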

Capturing this and other data, extracting meaning and signals that we can combine into clear, accurate, timely insights is absolutely at the heart of what we do at RE5Q. We are doing this at a scale and with the kind of processing smarts and power you are much more likely to associate with Big Tech than with, say, the IT department of a real estate company.

The power to compute

MS: So if the built environment is really one of THE big data challenges, it must take a phenomenal amount of computing power to bring that all together and generate actionable insights from it. In reasonably lay-person terms, can you give us some sense of the tech power you have needed to bring together to create those insights?

SR: Good question. It probably makes sense to think about some of the challenges here and what we are doing at RE5Q to respond. 

Data Storage 

So, first off, as we’ve touched on, real estate data is massive and growing exponentially, consistent with the consensus view that over 90% of all data has been generated in the last few years. 

To be in this game you therefore need to have world class storage to store and then digest petabytes of data. At RE5Q, we use the same storage engine used by European research agency CERN for the Large Hadron Collider, which generates around 90 petabytes of data annually from experiments on the world’s largest and most powerful particle accelerator. This means we can bulk import multi-level data at a prodigious rate. Actual numbers are commercially sensitive, but let’s just say, many petabytes per month might not be unheard of. This gives us a tremendous amount of optionality. We use cloud as well, but used in isolation, cloud can create constraints in terms of capacity and cost.

Data Onboarding

And the on-boarding of that data needs to be smart, efficient and very thought through. “Data is the new oil” is not really a cliché to which I subscribe. It’s much more “data is the new nuclear fuel” — incredibly powerful but needs to be handled with immense care and precision. 

So, some key features here to illustrate:

  • We’ve built a flexible data factory to support and house hundreds of thousands of data sources, automatically structured and linked both logically and geospatially, which you can easily add to and then publish from;
  • We have installed data firewalls and quality pipelines to mitigate poor consistency across data sources;
  • All our data sources are traced to source – we don’t tolerate raw/unclear origin data;
  • All data rights are managed from source;
  • Licence compliance and policy enforcement are executed by machine, so there is no gap between policy and practice;
  • Our data is multi-model/vendor, so if there is content we have, but are unable to redistribute due to licence restrictions, we can switch to sources that permit redistribution if available, or offer licence upgrades via micropayments or site licensing; and
  • All user data is GDPR compliant. User data is partitioned by geography so citizen data never leaves its home jurisdiction.
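The source-switching rule in the list above can be illustrated with a small sketch. Everything in it is hypothetical: the `Source` record, its fields and the returned action strings are invented for illustration and are not RE5Q’s actual data model or API. It only shows how a "traced to source, redistributable if possible, otherwise offer an upgrade" policy might be enforced in code rather than by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """Hypothetical record describing one vendor's copy of a dataset."""
    name: str
    origin: str            # where the data came from; empty means untraced
    redistributable: bool  # whether the licence permits redistribution


def resolve(candidates: list[Source]) -> str:
    """Decide how to serve content that several sources could supply."""
    # Data with no traceable origin is never considered.
    traced = [s for s in candidates if s.origin]
    if not traced:
        return "unavailable"
    # Prefer the first traced source whose licence permits redistribution.
    for source in traced:
        if source.redistributable:
            return f"serve:{source.name}"
    # Content exists but cannot be redistributed: offer a licence upgrade.
    return "offer_licence_upgrade"


restricted = Source("vendor_a", "land_registry", redistributable=False)
open_copy = Source("vendor_b", "land_registry", redistributable=True)
print(resolve([restricted, open_copy]))
print(resolve([restricted]))
```

Because the decision lives in one function, the same rule is applied on every request, which is the point of moving enforcement from people to machines.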

MS: Can you put some numbers and examples around RE5Q’s onboarding for us? 

SR: So RE5Q processes hundreds of thousands of data sources from around the world 24/7, collating the petabytes of data we just talked about, in over 240 languages. We now have data on over 20 million UK properties and 100% of Swiss addresses and have started data onboarding in the US and Ireland.

Our data categories are deep and wide — Company, Property, Legal, Planning and ESG data. So to bring that to life on the legal side, in Ireland, we onboarded all the legal data from the courts in a couple of weeks. We then set our proprietary AIs to work to find the relationships between the parties in each legal case, outcome and judgment. This, for example, allows users to surface how judges ruled historically on specific types of court case, relating to, say, certain types of property dispute or which senior executives were involved in relevant cases.

AI and Data lakes 

MS: Thanks, Seth. And talking more broadly about data lakes, can you expand on what RE5Q is doing with these in a way that sets it apart?

SR: So first-off you need to understand that managing data lakes, keeping them clean and as productive as you want them to be, is potentially a highly manual, super-expensive task. 

Self-learning data governance

At RE5Q, we’ve built a highly advanced, self-learning data governance system, so that our data quality, correction, validation and curation are 100% AI-powered and scalable to hundreds of thousands of data sources, without ANY database changes or human input. And the self-learning part there is important: our AI is geared to generating new datasets from existing data sources, so you get this multiplier effect on the data pools.

We’ve spoken to some of the market leaders across the real estate sector and we don’t believe anyone else is close to us in terms of the depth and breadth of these capabilities.

Surfacing insight 

Second, clearly, we deploy a whole battery of machine-learning algorithms to surface the insights from the data lakes. 

Our data lake technology compares very favourably with, for example, legacy cloud offerings and, combined with our compute, virtualisation and networking technology, means we can move data and train AIs in minutes, not hours.

Lifting the bonnet on part of that, storing complex data and extracting meaning is only possible with a diverse and sound network of mathematical models. 

If you’ll allow me to get technical for a moment, the models supported by RE5Q range from classical to quantum. Current implementations include quaternion, statistical/probabilistic, n-dimensional vector and similarity models. In addition, we use Hilbert space models and non-Euclidean topological approaches in many areas, from Automated Valuation Model (“AVM”) recombination to isochrone maps. And in virtual reality generation we commonly deploy, for example, 5-dimensional models (to overcome the problems of a 3D model, e.g. gimbal lock).
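To make the gimbal lock point concrete: the textbook way to avoid it is to represent orientation with a unit quaternion (one of the model types listed above) instead of three Euler angles. The sketch below uses that standard technique; it is not the 5-dimensional model mentioned in the interview, whose details are not given, and all function names are illustrative.

```python
import math


def quat_from_axis_angle(axis, angle_rad):
    """Unit quaternion (w, x, y, z) for a rotation of angle_rad about axis."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    s = math.sin(angle_rad / 2) / norm
    return (math.cos(angle_rad / 2), x * s, y * s, z * s)


def quat_multiply(a, b):
    """Hamilton product a*b; as a rotation, b is applied first, then a."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def quat_conjugate(q):
    """Conjugate, which is the inverse for a unit quaternion."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def rotate(vector, q):
    """Rotate a 3D vector by unit quaternion q using q * v * q^-1."""
    p = (0.0, *vector)
    _, x, y, z = quat_multiply(quat_multiply(q, p), quat_conjugate(q))
    return (x, y, z)


# Pitch by 90 degrees (the orientation where Euler angles lose a degree of
# freedom), then yaw by 90 degrees. The quaternion composition stays well
# defined and is a single rotation with no special case.
pitch = quat_from_axis_angle((0, 1, 0), math.pi / 2)
yaw = quat_from_axis_angle((0, 0, 1), math.pi / 2)
combined = quat_multiply(yaw, pitch)
print(rotate((1.0, 0.0, 0.0), combined))
```

The design point is that the orientation is stored as one four-component object and composed by multiplication, so no pair of rotation axes can line up and remove a degree of freedom.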

Thirdly, we surface that insight in a way which is user-friendly – so just like a Google search, but richer, deeper and more expertly focused on real estate.

ESG and AVMs

MS: Thanks Seth. Can you give us some more real-world applications, bringing some of that “edge” to life? Let’s start with ESG – as I noted before, absolutely front and centre in the conversations I have with other senior leaders across the industry.

SR: Sure 

ESG and “triangulation”

One thing that is becoming very clear for ESG is that there is no single source of truth. Currently, there are a lot of competing standards and imperfect vendor data-sets. You need to be able to bring together a range of conventional, alternative and adjacent data sources and then triangulate an optimised end position from all of that in order to generate a truer view. It’s what data engineers call a cross-cutting concern and it’s a good example of a problem that AI and Big Data can really help with. It also goes back to the points I was making earlier on how we’ve designed and built our systems – precisely to be versatile and powerful enough to solve for these challenges.

So let’s make this real. Recently, an asset management firm in Germany came to us as they were keen to determine how much waste was recycled at specific buildings, with as much precision as possible, but without the time and cost of a laborious manual checking exercise. 

RE5Q has the location of all landfill sites and recycling centres globally in its data lakes, and, relevant to this case, applicable German recycling collection schedules and the geospatial data and mapping information of the specific buildings in question. By triangulating those sources, we were able to produce specific building-level recycling metrics. This is one example, but the wider use case is evident.

It’s worth noting that we generate our own proprietary geospatial data and mapping information – again we triangulate there using satellite imaging as well as multiple public registers and, as in the case of the US, using some acquired data assets, to overcome common problems such as postal databases not having the degree of specificity you need for accurate, granular work. 


And really having the richest, most accurate geospatial information is a platform for layering on all sorts of other property metrics which can create value in the hands of end-users. So, very briefly, one example here is Automated Valuation Modelling or AVM.

Our first AVM for Office Space and Rental values for specific properties in the UK was launched in June 2021 and we are mid-way through our build of a second for UK residential properties. When we look at other AVMs in the market we tend to see, by comparison, much smaller data-sets behind the models, smaller coverage areas and MAPEs (Median Absolute Percentage Errors) averaging around 15%, whereas ours have better prediction accuracy. 
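For readers unfamiliar with the metric, the sketch below shows how an absolute percentage error figure is computed for a valuation model. The interview expands MAPE as the median absolute percentage error; the acronym is also widely used for the mean absolute percentage error, so both are computed. The valuations are made-up numbers for illustration only and say nothing about any real AVM.

```python
from statistics import mean, median


def absolute_percentage_errors(predicted, actual):
    """Per-property |predicted - actual| / |actual|, expressed in percent.

    Actual values must be non-zero, otherwise the ratio is undefined.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return [100.0 * abs(p - a) / abs(a) for p, a in zip(predicted, actual)]


def median_ape(predicted, actual):
    """Median absolute percentage error (robust to a few large misses)."""
    return median(absolute_percentage_errors(predicted, actual))


def mean_ape(predicted, actual):
    """Mean absolute percentage error (sensitive to outliers)."""
    return mean(absolute_percentage_errors(predicted, actual))


# Hypothetical model valuations versus achieved prices for five properties.
model_values = [510_000, 295_000, 1_200_000, 432_000, 760_000]
sale_prices = [500_000, 310_000, 1_000_000, 450_000, 800_000]

print(f"median APE: {median_ape(model_values, sale_prices):.1f}%")
print(f"mean APE:   {mean_ape(model_values, sale_prices):.1f}%")
```

The two figures can differ noticeably when a handful of valuations miss badly, which is why it matters which variant a quoted error rate refers to.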

And I’ve talked there about German and UK markets, but the principles apply globally and the datasets we are ingesting are global e.g. satellite data used for building outline polygons and geo-coding. And we are having very developed conversations around applications with interested parties from the Middle East to Asia Pacific to Latin America to North America and Europe. 

Incumbent ability to compete 

MS: Great, thanks Seth. And faced with this kind of tech firepower, I guess the obvious question here is: can real estate firms, or indeed any other user of the built environment, actually match this themselves?

SR: So, I think the short answer is that it’s really challenging for them. Candidly, they are not really set up for it, because they are in the business of real estate, not the business of transformation, which is our core focus.

Over the past three and a half years or so, RE5Q has painstakingly put together, from the likes of Google, Facebook, CERN, NASA, Tesla and so on, the multi-disciplinary team of world-class engineers, programmers, quants, data scientists, artificial intelligence technologists and, of course, real estate domain experts you need to get this kind of thing done. Effectively, we are solving for what you rightly described earlier as one of the Big Data challenges, but without the distraction of, say, running a full-time real estate business.

It’s very hard to take a toe-in-the-water approach to this. It took Google to build the pre-eminent internet search engine. Effectively, what we are talking about here is creating the world’s largest repository and search engine for the built environment.

So, I think transformational change is going to come from outside rather than within, but it will absolutely be a partnership between disruptors and incumbents to realise all the benefits. 

Versatility, purpose, partnership

MS: And what do you think, Seth, is going to be the best model for real estate industry participants to access and benefit from this? What does this partnership look like?

SR: So, the question on the right model for that partnership is an interesting one. We’ve debated it internally a good deal and we are already into several revenue-generating partnerships, ranging from property market intelligence to realising business rate efficiencies. In almost every case there are common themes:

  • Identify processes that are not data-driven (or have poor data so require lots of human input);
  • Identify/Capture/Create Content for new digital services using AI/ML and new data sets;
  • Create a JV that creates new enhanced end-to-end business processes that are more accurate, higher quality, data-driven and automated; and
  • Deliver growth at scale, beyond what is possible with a human-only approach.

The really exciting starting point here is that the data lakes and smart AI we’ve built and continue to add to and refine are incredibly versatile. They are a platform on which an almost limitless number of use cases can be built and delivered in tailored ways that work for specific users. On prem, cloud or hybrid. From interactive fast queries, dashboard reports, machine learning, ad-hoc analysis to enterprise search. 

Our bias (and ethos) is towards using our platform to create pragmatic, purposeful, real-world tools which “move the needle” and enable positive transformational change for small and micro-businesses as well as big global companies, and which ultimately benefit the many, not just the few. In practice, there will be a spectrum of ways we commercialize, but a reliable, simple-to-use application that is easy and affordable to license will likely feature, even if RE5Q’s tech is white-labelled, rather than us being centre stage. As I said, I guess this one is ultimately one for you as new CEO!

MS: Absolutely Seth and I think, in many ways, we have the quality problem of having such depth and versatility that the real strategic question is how we prioritize and successfully execute the right opportunities, with the right partners, at the right time.  

Let’s just finish with some crystal-ball gazing. Three to five years from now what does it look like?

SR: Big question. Here are some thoughts for you:

  • Smarter Cities: As I noted, the continued ramp up in renewables will create a far more dynamic power grid. AI and ML are already powering software to manage grids. Once we exceed 40% renewables (as a proportion of total energy use) these tools become a must-have, rather than a nice-to-have. Smarter grids mean cities will be smarter, and that covers existing buildings, not just clean-slate new-builds. We’ll see low-cost, easy-to-deploy IoT sensors powered with Edge AI (i.e. AI in the relevant device) to enhance efficiency and create new revenue opportunities.
  • Frictionless dealing: A lot of the opacity around transactional real estate – value, negotiation, process – should be significantly reduced, ideally on blockchain or some other distributed ledger, with smart contracts and virtual reality enabled in a way which makes the asset class more fluid and frictionless, more transparent and more accessible.
  • Quantum: This is arguably the biggest change to come to computers since the transistor. It’s a complete game changer. It will not just replace what we have today. People think new tech is old tech but faster and cheaper and since the 1950s that has been broadly true. Quantum is not that. It’s not faster, it’s not cheaper. It’s a new class of compute that humanity has not seen before. It rewrites the book on almost everything we think we know. Look at storage — 700 binary digits (bits) are not enough to store a passport photo. But 700 quantum bits (qubits) are enough to store ALL THE DATA ON EARTH. Quantum also enables un-hackable communications. It is step-change technology, whose potential we are only beginning to understand. But in real estate many of the insights from Quantum can be leveraged. Look at the arbitrage in any market: a probability distribution. The Heisenberg Equation perfectly describes this and Quantum perfectly models this! At RE5Q, we are already using some of the ideas in this space to create new algorithms for conventional computers that leverage this type of opportunity, including across the ESG and AVM examples I mentioned earlier.
  • Cyber: The risks around Cyber are growing. As we connect more and more devices to more and more networks, the potential attack surfaces expand. Attackers routinely exploit vulnerabilities in IoT devices to gain a foothold from which to compromise large servers. At RE5Q, we believe security cannot be added as an afterthought; rather, it is a central and guiding principle in the way we design and build our own network and data firewalls. Our data is encrypted in motion and at rest. We also don’t subscribe to security by obscurity and only use military-grade security standards. We also extend this to policy enforcement and licence compliance. By moving enforcement from humans to machines we create systems with no gap between policy and practice. We are also highly sensitized to privacy issues and concerns and use anonymisation and pseudonymisation in our AI where required to manage and/or prevent individual data disclosure.

MS: It’s great, too, to realise that RE5Q is positioned to play an important part across all those themes.

It only remains for me to thank you Seth for letting me take a bit of a tour with you on what we have “under the bonnet.” 

Watch this space on how and where we enable those transformations we’ve touched on here.