Virtual
A Robust and Informative Application for viewing the dataframes in R
More info: In R programming, the View() function from the utils package provides a basic interface for viewing a dataframe. This interface lacks features such as column selection, complex filtering, display of variable data types, variable metadata, code reproducibility, and download options. This presentation ([please click on this link to view my draft presentation][1]) will demonstrate a newly created, feature-rich application that includes all of the above features. The application was built using Shiny modules for viewing and examining dataframes from various statistical software such as SAS and Python. [1]: https://docs.google.com/presentation/d/1Mygbx-15iYyd8CVh7sgxtuhtteQ6hFZM/edit?usp=drivesdk&ouid=103296676447663833578&rtpof=true&sd=true
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Madhan Kumar Nagaraji
Keyword(s): statistical programming, clinical trials data, dataset interface, workflow
Video recording available after conference: ✅
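As context for what a module-based dataframe viewer involves, here is a minimal sketch of the Shiny module pattern the abstract refers to. The function names (`df_viewer_ui`, `df_viewer_server`) and the column-selection feature shown are illustrative assumptions, not the presenter's actual code.

```r
library(shiny)

# Hypothetical module UI: a column selector plus a table
df_viewer_ui <- function(id) {
  ns <- NS(id)
  tagList(
    selectInput(ns("cols"), "Columns", choices = NULL, multiple = TRUE),
    tableOutput(ns("table"))
  )
}

# Module server: restricts the data frame to the selected columns
df_viewer_server <- function(id, data) {
  moduleServer(id, function(input, output, session) {
    updateSelectInput(session, "cols",
                      choices = names(data), selected = names(data))
    output$table <- renderTable({
      req(input$cols)
      data[, input$cols, drop = FALSE]
    })
  })
}

# Usage:
# shinyApp(
#   fluidPage(df_viewer_ui("viewer")),
#   function(input, output, session) df_viewer_server("viewer", mtcars)
# )
```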
A first look at Positron
More info: Positron is a next generation data science IDE built by the creators of RStudio. It has been available for beta testing for a number of months, and R users may have wondered if they should try it or if it will be a good fit for them. This new IDE is an extensible tool built to facilitate exploratory data analysis, reproducible authoring, and publishing data artifacts, and it is an IDE that supports but is not built only for R. How should an R user think about Positron, compared to the other options out there? In this talk, learn about how and why Positron is designed the way it is, what will feel familiar or new coming from other IDEs such as RStudio, and when (or if) people who use R should consider giving it a try. You’ll hear about different choices when it comes to defaults and ways of working, such as how to think about your projects or folders and how to manage multiple versions of R. You will also learn about new functionality for R users and package developers that we have never had before, like new approaches for managing R package tests and the ability to customize an IDE using extensions. If you are curious about Positron and how it fits into the R ecosystem, you’ll come away from this talk with more details about its capabilities and more clarity about whether it may be a good choice for you.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Julia Silge (Posit, PBC)
Keyword(s): ide, workflow, tooling
Video recording available after conference: ✅
Analyzing Census Data in R: Techniques and Applications
More info: This talk provides an introduction to working with IPUMS Census American Community Survey (ACS) data in R, focusing on key techniques for data preparation, weighting, and sampling. Participants will gain a foundational understanding of how Census data is structured and learn how to apply statistical weights to create representative analyses. The talk also explores the role of Census data in artificial intelligence (AI) and machine learning (ML), highlighting its applications in model training, fairness assessments, and demographic insights. Finally, the talk addresses the critical use of Census data in anti-discrimination frameworks, demonstrating how demographic techniques can help evaluate bias and promote equitable AI/ML outcomes. Through practical exercises and case studies, participants will develop essential skills for integrating Census data into AI-driven analyses with an emphasis on equity.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Joanne Rodrigues
Keyword(s): demography, frameworks, census data, equity ml/ai, anti-discrimination in ml/ai
Video recording available after conference: ✅
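The weighting technique the abstract mentions can be illustrated in base R. The toy data and the weight column (in the spirit of the ACS person weight, PERWT) are invented for illustration:

```r
# Toy microdata: each row is a sampled person; `weight` is the number of
# people in the population that the row represents (like the ACS PERWT)
people <- data.frame(
  income = c(30000, 52000, 75000, 41000),
  weight = c(120, 80, 40, 160)
)

# Unweighted mean treats every respondent equally...
unweighted <- mean(people$income)

# ...while the weighted mean reflects the population the sample represents
weighted <- weighted.mean(people$income, people$weight)

# Weighted population total for income
total_income <- sum(people$income * people$weight)
```

In practice, ACS analyses often go further, using replicate weights for variance estimation via packages such as survey or srvyr.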
Automating workflows with webhooks and plumber in R
More info: Webhooks have opened up new possibilities for automating workflows, eliminating the need for manual intervention. In this talk, I will demonstrate how you can use plumber, an R package for building APIs, to create a webhook listener that triggers your workflows, such as updating dashboards, triggering machine learning retraining, and other potential use cases. In the presentation, I will cover how to process payloads (using GitHub webhooks as an example), how to verify authenticity using HMAC signatures, and how to implement logging for tracking script execution, debugging, and monitoring. This talk will benefit researchers, data scientists, and developers who want to make their R workflows responsive to certain triggers.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Clinton David
Keyword(s): automation, event-driven workflows, plumber api, github webhooks
Video recording available after conference: ✅
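A minimal sketch of the HMAC verification step described above, assuming the GitHub convention of a `X-Hub-Signature-256: sha256=<hex>` header over the raw request body. The `verify_signature` helper is hypothetical; the plumber wiring is shown in comments to keep the runnable part dependency-light:

```r
library(digest)   # for hmac()

# Verify a GitHub-style webhook signature: GitHub sends the header
# "X-Hub-Signature-256: sha256=<hex HMAC-SHA256 of the raw body>"
verify_signature <- function(body, signature_header, secret) {
  expected <- paste0(
    "sha256=",
    hmac(secret, body, algo = "sha256", serialize = FALSE)
  )
  identical(signature_header, expected)
}

# A plumber endpoint could then gate the workflow on the check, e.g.:
# api <- plumber::pr() |>
#   plumber::pr_post("/webhook", function(req, res) {
#     sig <- req$HTTP_X_HUB_SIGNATURE_256
#     if (!verify_signature(req$postBody, sig, Sys.getenv("WEBHOOK_SECRET"))) {
#       res$status <- 401
#       return(list(error = "bad signature"))
#     }
#     list(status = "accepted")  # trigger the downstream job here
#   })
```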
Beyond Guesswork: How Econometric Models (MMMs) Guide Genius Marketing Decisions
More info: In a world where marketing budgets are scrutinised, customer journeys are more fragmented than ever, and every digital channel feels as though it’s taking over the world - how do you really know where to invest? Traditional measurement methods often fall short, leaving marketers with more questions than answers: Is TV a dead channel? Is last-click attribution selling your brand-building efforts short? Are you overinvesting in one channel while under-utilising another? In this talk, we will take you beyond surface-level metrics and into the world of econometrics in R: the gold standard for understanding marketing effectiveness. By applying time-series regression techniques, we can move past vanity metrics and gut feelings to uncover the real impact of marketing spend. Attribution is one of the biggest debates in marketing measurement; this talk will therefore also explore whether first-click, last-click, or equal weightings really make sense - or whether a more nuanced approach is needed to reflect consumer behaviour. Next, we’ll explore how to quantify return on investment with rigour, determine the optimal allocation of budget across channels, visualise the relationships between channels, and understand diminishing returns to avoid wasted spend. All while breaking down key marketing acronyms, campaign types, and measurement approaches to ensure you leave with a clear understanding of how to apply these concepts in the real world. Finally, we will demonstrate our award-winning (IPA, 2024) case study, built in R with December19 Media Agency for Xero Accounting UK, to bring our time-series regression analyses to life. If you’re looking for a session that moves beyond generic reporting and 6-figure marketing agency prices – this talk is made for you!
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Abbie Brookes (Data Scientist @ Datacove), Jeremy Horne (Director @ Datacove)
Keyword(s): marketing, statistical modelling, econometrics, measurement, regression
Video recording available after conference: ✅
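The time-series regression idea above can be sketched in base R. The geometric adstock transform, the simulated spend data, and all coefficients below are illustrative assumptions, not the presenters' model:

```r
# Geometric adstock: carry over a fraction `decay` of last week's
# transformed spend, modelling the lingering effect of advertising
adstock <- function(spend, decay) {
  out <- numeric(length(spend))
  out[1] <- spend[1]
  for (t in seq_along(spend)[-1]) {
    out[t] <- spend[t] + decay * out[t - 1]
  }
  out
}

# Simulated weekly data: sales respond to adstocked TV and raw search spend
set.seed(1)
weeks  <- 104
tv     <- pmax(rnorm(weeks, 50, 20), 0)
search <- pmax(rnorm(weeks, 30, 10), 0)
sales  <- 200 + 1.5 * adstock(tv, 0.6) + 2.0 * search + rnorm(weeks, 0, 10)

# A bare-bones media-mix regression; real MMMs add seasonality, price,
# saturation curves, and other diminishing-returns transforms
mmm <- lm(sales ~ adstock(tv, 0.6) + search)
coef(mmm)
```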
CSV to Parquet: Managing data for multi-language analytics teams
More info: CSV is arguably the default data storage format for analytics teams, prized for its simplicity: when the data is small, it is easy to inspect a CSV file in a spreadsheet program. However, CSV files become very slow to read and write at larger data sizes. Enter Apache Parquet: "Apache Parquet is an open source, column-oriented data file format designed for efficient data storage and retrieval. It provides high performance compression and encoding schemes to handle complex data in bulk and is supported in many programming languages and analytics tools." (from the Apache Parquet documentation). The talk will give an overview of the Apache Parquet format and its advantages over CSV. I will also demonstrate reading and writing data in this format in R, show the interoperability with Apache Arrow, and demonstrate how this format makes life easier for polyglot teams that use R and Python. The session will end with key points to consider when deciding on a storage format. Participants will be introduced to Apache Parquet and Arrow and be better able to choose a storage format for their workflows. Broad agenda: 1. Overview of the Apache Parquet format; 2. Benchmarking against CSV; 3. Reading and writing data in R and Python; 4. Interoperability with Arrow and PyArrow; 5. Conclusions and takeaways.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Viswadutt Poduri
Keyword(s): data processing, parquet, analytics, big data, storage
Video recording available after conference: ✅
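The R side of the round trip described above can be sketched with the arrow package (the toy data frame is invented for illustration):

```r
library(arrow)

# Write a data frame to Parquet and read it back; column types survive
# the round trip (unlike CSV, where everything is re-parsed from text)
df <- data.frame(
  id    = 1:3,
  when  = as.Date("2025-08-01") + 0:2,
  value = c(1.5, 2.5, 3.5)
)

path <- tempfile(fileext = ".parquet")
write_parquet(df, path)
back <- read_parquet(path)

# On the Python side the same file opens with pyarrow/pandas, e.g.:
#   import pandas as pd
#   pd.read_parquet(path)
```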
Data Visualization for Exploratory Factor Analysis
More info: Exploratory factor analysis (EFA) is routinely used by researchers to reduce the dimensionality of data and to form meaningful factors. While there are good guidelines on how to report the results, data visualization tools are rarely used in understanding the results of EFA. Good data visualization, especially in a multivariate framework with ordinal data, makes it easier for people to interpret the results of the analysis. This presentation introduces and demonstrates different data visualization techniques that can be used to illustrate the results of EFA and improve its interpretation. Advantages and disadvantages of each of these techniques are discussed. As EFA is oftentimes used on categorical data from survey research, exploratory data visualizations for categorical variables are also presented, apart from visualizations of the factor results. Moreover, these graphical tools can be used for purposes other than EFA. Data visualization and analysis is performed in R using publicly available survey research data.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Nivedita Bhaktha (Indian Institute of Technology Kanpur)
Keyword(s): factor analysis, exploratory data analysis, dimension reduction, ordinal data, survey research
Video recording available after conference: ✅
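For readers new to EFA in R, a minimal base-R sketch (simulated data, not the presenter's example) showing how factor loadings can be reshaped into a long format that is convenient for a loadings heatmap or bar chart:

```r
# Exploratory factor analysis with base R's factanal()
set.seed(42)
n <- 300
f1 <- rnorm(n); f2 <- rnorm(n)

# Six observed variables loading on two latent factors
dat <- data.frame(
  v1 = f1 + rnorm(n, sd = 0.5), v2 = f1 + rnorm(n, sd = 0.5),
  v3 = f1 + rnorm(n, sd = 0.5), v4 = f2 + rnorm(n, sd = 0.5),
  v5 = f2 + rnorm(n, sd = 0.5), v6 = f2 + rnorm(n, sd = 0.5)
)

fit <- factanal(dat, factors = 2, rotation = "varimax")

# Long-format loadings, one row per (variable, factor) pair,
# ready to feed into a heatmap or sorted bar chart
loadings_df <- data.frame(
  variable = rep(rownames(fit$loadings), ncol(fit$loadings)),
  factor   = rep(colnames(fit$loadings), each = nrow(fit$loadings)),
  loading  = as.vector(fit$loadings)
)
```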
Don’t Write Code Your Users Haven’t Asked For
More info: The fact that your code works doesn’t mean it’s useful to your users. Ensuring that code works correctly with unit testing is well established in R, but validating that we write the correct code—aligned with user needs—remains a challenge. In this talk, we’ll explore how Behavior-Driven Development (BDD) helps us collaborate with stakeholders by translating their needs into automated tests that check whether our software satisfies them. You’ll walk away with an understanding of how to start practicing BDD in R with {cucumber}: a method of producing tests that describe what your software does and that stay easier to maintain as your software evolves.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Jakub Sobolewski
Keyword(s): testing, behavior-driven development, test-driven development, efficient programming, gherkin
Video recording available after conference: ✅
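For readers unfamiliar with BDD, specifications in this style are typically written in Gherkin, the plain-language format that tools like {cucumber} translate into automated tests. The feature below is an invented illustration, not from the talk:

```gherkin
Feature: Filtering a sales report
  Scenario: A user narrows the report to one region
    Given a sales report with regions "North" and "South"
    When the user filters the report to "North"
    Then only rows for "North" are shown
```

Each Given/When/Then line is then bound to an R step definition that exercises the actual code, so the specification doubles as an executable test.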
Experimenting with LLM Applications in R
More info: Large language models (LLMs) are surprisingly easy to use; at their core, they’re just an API call away. But how do you go from calling a model to actually building something useful? In this talk, I’ll share my experiences creating and deploying LLM-powered applications in R. I’ll walk through different approaches I’ve tried, from incorporating LLMs into Shiny apps and using them to improve my R code, to experimenting with different models and deployment options. Along the way, I’ll highlight what worked, what didn’t, and what I learned in the process. Whether you’re curious about integrating LLMs into your own projects or just want to see what’s possible, this session will offer a practical look at building with LLMs in R without the hype.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Nic Crane
Keyword(s): automation, llms, ai
Video recording available after conference: ✅
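To make "just an API call away" concrete, here is a sketch that builds (but does not send) a chat-completion request with httr2. The URL, model name, and body shape are illustrative assumptions in the style of OpenAI-compatible APIs, not any specific vendor's contract:

```r
library(httr2)

# Build a chat request; req_perform(req) would actually send it
req <- request("https://api.example.com/v1/chat/completions") |>
  req_auth_bearer_token(Sys.getenv("LLM_API_KEY", "dummy-key")) |>
  req_body_json(list(
    model = "example-model",
    messages = list(
      list(role = "user", content = "Summarise mtcars in one sentence.")
    )
  ))

# resp <- req_perform(req)          # send the request
# resp_body_json(resp)              # parse the model's reply
```

Higher-level packages such as ellmer wrap this pattern so you rarely construct requests by hand.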
From Data to Narrative: Interactive Storytelling with Shiny
More info: Data Storytelling transforms complex datasets into clear, engaging narratives, combining analysis, visualization, and storytelling to inspire action and facilitate decision-making. This session focuses on using Shiny to craft compelling stories through dynamic, interactive applications that turn raw data into impactful insights. Through live demonstrations, attendees will discover how Shiny (via Shinylive in Quarto) bridges the gap between data and storytelling, empowering developers to create interactive dashboards that communicate complex ideas with clarity and impact. The session will highlight practical examples and best practices for building stories that resonate with diverse audiences. By the end of the session, participants will not only understand how to use Shiny to build interactive dashboards but also how to leverage these tools to create meaningful, audience-focused narratives. Shinylive will be demonstrated as a key enabler of engaging, visually appealing, and interactive data storytelling.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Francisco Alfaro (USM)
Keyword(s): quarto, shiny, data storytelling, interactive dashboards, visualization
Video recording available after conference: ✅
Generating interesting high-dimensional data structures
More info: A high-dimensional dataset is one where each observation is described by many features, or dimensions. Such a dataset might contain various types of structures with complex geometric properties, such as nonlinear manifolds, clusters, or sparse distributions. We can generate data containing a variety of structures using mathematical functions and statistical distributions: sampling from a multivariate normal distribution generates data in an elliptical shape, a trigonometric function can generate a spiral, and a torus function can create a donut shape. High-dimensional data structures are useful for testing, validating, and improving algorithms used in dimensionality reduction, clustering, machine learning, and visualization. Their controlled complexity allows researchers to understand the challenges posed in data analysis and helps to develop robust analytical methods across diverse scientific fields like bioinformatics, machine learning, and forensic science. Functions to generate a large variety of structures in high dimensions are organized into the R package cardinalR, along with some already generated examples.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Piyadi Gamage Jayani Lakshika (Monash University, Australia); Dianne Cook (Monash University, Australia), Paul Harrison (Monash University, Australia), Michael Lydeamore (Monash University, Australia), Thiyanga S. Talagala (University of Sri Jayewardenepura, Sri Lanka)
Keyword(s): high-dimensional data structures, mathematical functions, statistical distributions, geometrics
Video recording available after conference: ✅
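The three shapes named in the abstract (ellipse, spiral, torus) can be generated in a few lines of base R. This is a generic illustration of the idea, not code from cardinalR:

```r
n <- 500

# An elliptical Gaussian cloud in 4-D (diagonal covariance for simplicity)
gauss4 <- sapply(c(1, 2, 0.5, 3), function(sd) rnorm(n, sd = sd))

# A 2-D spiral from trigonometric functions, embedded in 4-D with noise
th <- seq(0, 4 * pi, length.out = n)
spiral <- cbind(th * cos(th), th * sin(th),
                rnorm(n, sd = 0.1), rnorm(n, sd = 0.1))

# A torus (donut) in 3-D: R is the distance from the centre of the hole
# to the centre of the tube, r is the tube radius
u <- runif(n, 0, 2 * pi); v <- runif(n, 0, 2 * pi)
R <- 3; r <- 1
torus <- cbind((R + r * cos(v)) * cos(u),
               (R + r * cos(v)) * sin(u),
               r * sin(v))
```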
Health care data harmonization using Shiny, clinical experts, and RDBMS
More info: In support of a large, international, multi-site health care project that developed new pediatric sepsis criteria, we created a pipeline that allows clinical experts to harmonize medications, observations, events, and laboratory measurements from electronic medical record extracts. This pipeline was instrumental in allowing the review and use of 2.2 billion rows (175 GB) of source data. During the development of the sepsis criteria, we received multiple new data deliveries from each site, which required frequent review and re-harmonization of the provided source datasets. The harmonization pipeline consisted of multiple steps, including conflating multiple source row types into one harmonized type, performing source-specific unit mapping, and performing value transformations. In an iterative process, clinical experts would identify rows for mapping, data scientists would run the harmonization pipeline, and clinical experts would then review the mapped data using Shiny tools custom built for this project. Due to the project and dataset size, we leveraged a range of tools including Google BigQuery, R, and make. After harmonization, the cleaned dataset was approximately 1.7 billion rows (155 GB) in size. This large amount of data required special considerations to perform acceptably: to keep Shiny responsive, to keep the server hosting our Shiny apps from crashing, and to prevent client browser crashes, we had to limit the data being reviewed to at most a random sample of 50% of the larger data groupings.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Seth Russell (University of Colorado Anschutz Medical Campus)
Keyword(s): big data, shiny, healthcare, data harmonization, rdbms
Video recording available after conference: ✅
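As a toy illustration of one step of this kind of pipeline, the unit-mapping idea can be sketched in base R with a lookup table and a merge. The data, unit names, and conversion factor below are invented for illustration, not from the project:

```r
# Site lab extracts reporting the same test in different units
labs <- data.frame(
  site  = c("A", "A", "B"),
  test  = c("glucose", "glucose", "glucose"),
  value = c(5.5, 6.1, 100),
  unit  = c("mmol/L", "mmol/L", "mg/dL")
)

# Mapping table of conversion factors to the harmonized unit,
# of the kind clinical experts would curate and review
unit_map <- data.frame(
  unit        = c("mmol/L", "mg/dL"),
  target_unit = c("mg/dL", "mg/dL"),
  factor      = c(18.0182, 1)
)

# Join the mapping onto the data, then apply the value transformation
harmonized <- merge(labs, unit_map, by = "unit")
harmonized$value_harmonized <- harmonized$value * harmonized$factor
```

At the scale described in the abstract, the same join-and-transform logic would run inside the database (e.g. BigQuery) rather than in R's memory.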
Intracranial Pressure Monitor Placement Prediction in Children with Traumatic Brain Injury
More info: Traumatic brain injury causes approximately 2,200 deaths and 35,000 hospitalizations in U.S. children annually. Clinicians currently make decisions about placing an intracranial pressure (ICP) monitor in children with traumatic brain injury without the benefit of an accurate clinical decision support tool. In a prospective observational cohort study, we developed and validated models that predict placement of an ICP monitor. Patient data was gathered from multiple sources and discretized into 5-minute intervals. We divided data into four combinations of nurse-documented and chart-extracted input data, all including patient-level and vital-sign variables, and with inclusion or exclusion of data from brain computed tomography imaging reports and invasive blood pressure readings. Using R, we built machine learning models using logistic regression, support vector machines, generalized estimating equations, generalized additive models, and LSTMs. We trained each model with each combination of data. Optimal parameters were identified based on the highest F1 score. The best performing model, an LSTM deep learning model, achieved an F1 of 0.71 within 720 minutes of hospital arrival. The best non-neural-network model, standard logistic regression, achieved an F1 of 0.36 within 720 minutes of hospital arrival. While non-RNN models did not achieve the best F1, their coefficient sizes and directions provide insight into the factors predicting ICP monitor placement. Additionally, the generalized additive models allow for visualization and interpretation of the marginal impact of a variable over time (after integrating out the impact of the other variables).
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Seth Russell (University of Colorado Anschutz Medical Campus)
Keyword(s): deep learning, machine learning, healthcare, decision making
Video recording available after conference: ✅
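A minimal base-R sketch of the non-neural baseline: a logistic regression plus an F1 computation at a 0.5 threshold. The simulated variables and coefficients are invented stand-ins, not the study's data or model:

```r
set.seed(7)
n <- 1000
hr   <- rnorm(n, 100, 15)          # heart rate (invented distribution)
gcs  <- sample(3:15, n, TRUE)      # Glasgow Coma Scale score

# Simulate the outcome so it depends on both predictors
prob <- plogis(-1 + 0.02 * (hr - 100) - 0.3 * (gcs - 10))
icp_monitor <- rbinom(n, 1, prob)

# Logistic regression and class predictions at a 0.5 threshold
fit  <- glm(icp_monitor ~ hr + gcs, family = binomial)
pred <- as.integer(predict(fit, type = "response") > 0.5)

# F1 = harmonic mean of precision and recall
f1_score <- function(truth, pred) {
  tp <- sum(truth == 1 & pred == 1)
  precision <- tp / sum(pred == 1)
  recall    <- tp / sum(truth == 1)
  2 * precision * recall / (precision + recall)
}
f1 <- f1_score(icp_monitor, pred)
```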
Plot Twist: Adding Interactivity to the Elegance of ggplot2 with ggiraph
More info: One of the most common critiques of ggplot2 is its lack of built-in interactivity. While static plots are powerful for storytelling, interactive visualizations can enhance exploration, engagement, and accessibility. The ggiraph package finally provides a seamless way to add interactivity to ggplot2—enabling hover effects, tooltips, and clickable elements—while preserving the familiar layered approach and custom theming. In this talk, Tanya Shapiro and Cédric Scherer will demonstrate why ggiraph stands out among other solutions, such as plotly, and how it integrates effortlessly with ggplot2 and its extension ecosystem. We’ll walk through real-world examples, explore its key functionalities, and share practical tips for creating engaging and well-designed interactive visualizations with ggiraph. Whether you’re looking to make your research more engaging, enhance dashboards, or create interactive reports, this talk will provide a solid foundation for elevating your data storytelling with interactive visualizations.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Cédric Scherer (Independent Contractor), Tanya Shapiro (Independent Contractor)
Keyword(s): data visualization, ggplot2, interactive charts, storytelling, dashboard
Video recording available after conference: ✅
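The core ggiraph pattern is small: swap a geom for its `_interactive` counterpart, add `tooltip`/`data_id` aesthetics, and wrap the plot with `girafe()`. A minimal sketch on a built-in dataset (not from the talk):

```r
library(ggplot2)
library(ggiraph)

# geom_point_interactive() replaces geom_point(); everything else is
# ordinary ggplot2, including themes and scales
p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point_interactive(
    aes(tooltip = rownames(mtcars), data_id = rownames(mtcars)),
    size = 3
  ) +
  theme_minimal()

# girafe() turns the ggplot into an interactive htmlwidget usable in
# R Markdown/Quarto documents and Shiny apps
widget <- girafe(ggobj = p)
```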
RDepot - 100% open source enterprise management of R and Python repositories
More info: RDepot is a solution for the management of R and Python package repositories in an enterprise environment. It allows users to submit packages through a user interface or API and automatically updates and publishes R and Python repositories. Multiple departments can manage their own repositories, and different users can have different roles in the management of their packages. With continuous integration infrastructure for quality assurance on R and Python packages, package uploads can be automated. All configuration is declarative, and RDepot can be set up as infrastructure as code, which is especially relevant in regulated contexts since it makes validation activities much easier. Packages from publicly available R repositories such as CRAN and Bioconductor can be mirrored selectively in custom repositories for use behind a firewall, in internal networks, and offline. Combined with Crane, authentication and fine-grained authorization (using OpenID Connect) can be configured per repository, which offers extra security when dealing with sensitive data or sensitive methodology. In this talk we will walk R users and developers through the different features of RDepot and demonstrate how these can be useful in different scenarios. The logic of the different workflows will be explained, and live demos will show the open source solution in action. We will address needs ranging from small research groups sharing a handful of packages up to multinational companies managing their R (and Python) code across the globe.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Jonas Van Malder
Keyword(s): package management, infrastructure, open source
Video recording available after conference: ✅
Sharing data science artifacts across teams using Crane
More info: Do you have to share many data science artifacts across teams? This is a problem for many data science organizations and can now be solved using Crane (https://craneserver.net/), a novel open-source product. Crane hosts data science artifacts such as data analysis reports, documentation sites, or even packages and libraries, all kept under strict authentication and authorization using modern protocols (OIDC). In this talk, we walk you through the different features of Crane and provide a live demo to explain the concepts. We will discuss its configuration file and demonstrate that authentication in Crane is fully declarative and allows for fine-grained configuration (at the user, group, or network level, or using SpEL) while still using an intuitive hierarchical tree that corresponds to the directory structure of the data. Next, we will show how artifacts can be accessed from or uploaded into Crane using the Crane API from R (e.g. to automate report updates or use data science artifacts in CI/CD) or using its customizable UI. Further, we zoom in on audit logs to track operations on all files (e.g. for GxP purposes) and detail the different storage backends (S3 and local file system). To ensure Crane can perform in high-security settings, the code base has been tested using integration tests reaching code coverage of more than 70%. With this talk we want to teach any R user and developer the essentials of Crane and how it can be used to share their data science artifacts.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Lucianos Lionakis (Open Analytics)
Keyword(s): data sharing, data, automation, r, repository management
Video recording available after conference: ✅
Shiny Policies: Customised Dashboards to Aid British Government Decisions
More info: Shiny dashboards are a powerful tool for visualising and interacting with data, but without thoughtful design they can feel generic, clunky, or even inaccessible to key users. In this talk, we will explore how to take Shiny beyond its default appearance to create dashboards that are not only visually appealing but also highly usable, accessible, and seamlessly integrated into an organisation’s digital environment. To demonstrate this, we will share our open-source dashboard built in collaboration with the British Department for Environment, Food and Rural Affairs (DEFRA). While the project required thorough data integration and analysis, one of the biggest challenges was ensuring the dashboard was not just functional but also visually cohesive, highly accessible, and intuitive for a broad range of users, including policymakers with varying levels of data literacy. We’ll start by discussing how to balance over-simplifying and over-complicating data. As with most open-source data, there is a vast library of data with little documentation on how to interpret it; making the most of open-source government data therefore underpins this talk, alongside interactivity and efficient rendering techniques that keep dashboards responsive and user-friendly. Next, with the foundations in place, we will jump into customisation, looking at how custom CSS and JavaScript can be leveraged to break free from the typical Shiny aesthetic, ensuring dashboards align with existing brand guidelines and user expectations. From typography and colour schemes to interactive elements, we’ll discuss techniques to create a polished, professional design that feels like a natural extension of an organisation’s existing web presence. Accessibility is another key factor in dashboard design. Many users—whether government policymakers, corporate stakeholders, or public audiences—have varying levels of data literacy, and a poorly designed interface can create barriers to insight. We’ll cover strategies for making dashboards more intuitive, including thoughtful navigation structures, tooltips, dynamic summaries, and alternative ways to display data for users with different needs. Additionally, we’ll explore best practices for ensuring compliance with accessibility standards, such as improving contrast, enabling keyboard navigation, and implementing screen-reader-friendly elements. By the end of this session, you’ll have a clear understanding of how to design Shiny dashboards that are not just functional but genuinely enjoyable to use, helping your audience engage with data more effectively and make better-informed decisions with open-source data.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Abbie Brookes, Jeremy Horne (Director @ Datacove)
Keyword(s): shiny apps, dashboard, environmental science, health science, decision-making, customisation
Video recording available after conference: ✅
ShinyProxy: easily deploy your Shiny apps
More info: ShinyProxy (https://shinyproxy.io/) is a 100% open source framework to deploy Shiny (and other) apps or web-based IDEs (like RStudio). Because of its flexibility, ShinyProxy is used by both small startups and large enterprises. Although ShinyProxy was originally tailored towards hosting Shiny apps, it can host virtually any web app. Since ShinyProxy makes it easy to build reproducible apps, even when using multiple R versions, it’s often used by pharmaceutical companies; nevertheless, it’s used by financial and engineering companies as well. ShinyProxy seamlessly integrates with your existing infrastructure (such as authentication providers and databases). The purpose of this talk is to give an introduction to ShinyProxy and to explain its use cases and its unique advantages over other solutions. No deep technical knowledge (e.g. of Docker or Linux) is needed to follow this talk; however, the talk will give you enough information to start using ShinyProxy yourself. As usual, the development of ShinyProxy has continued, so we’ll also give a preview of upcoming features.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Tobia De Koninck (Open Analytics NV)
Keyword(s): shiny, automation, docker, webapp
Video recording available after conference: ✅
System Design for Shiny Developers: The Comprehensive Deployment Architecture
More info: The presentation discusses aspects of application development and deployment that span beyond the Shiny application itself: data storage and access, user management and authentication, observability and telemetry, multi-lingual microservices for complex task delegation, caching, and more. All of these infrastructural elements can be created with free and open-source software such as the Docker engine, PostgreSQL, OpenLDAP, ShinyProxy, the R language, and various R packages. The entire system of services communicating with each other is orchestrated by Docker Compose and can be mapped on a single diagram, which is presented during the talk to provide a clear, high-level overview of the system design. The author will also present practical examples, guidelines, and tips on how to design and ship a complete solution from scratch. After the talk, the audience can expect virtual handout materials provided through a GitHub repository, which can be used as a starting template for their own projects.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Pavel Demin (Appsilon)
Keyword(s): shiny, shinyproxy, system design, microservices, docker
Video recording available after conference: ✅
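To make the Docker Compose orchestration concrete, here is an illustrative compose-file fragment wiring a database to ShinyProxy. The service names, images, file paths, and environment variables are generic assumptions, not the presenter's actual configuration:

```yaml
# Illustrative docker-compose sketch, not a prescribed setup
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - pgdata:/var/lib/postgresql/data
  shinyproxy:
    image: openanalytics/shinyproxy:latest
    ports:
      - "8080:8080"
    volumes:
      # ShinyProxy's own configuration, plus access to the Docker socket
      # so it can launch app containers
      - ./application.yml:/opt/shinyproxy/application.yml
      - /var/run/docker.sock:/var/run/docker.sock
    depends_on:
      - db
volumes:
  pgdata:
```

Additional services (an LDAP server, a reverse proxy, telemetry collectors) slot in as further entries under `services:`, which is what makes the single-diagram overview possible.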
The Future of Asynchronous Programming in R
More info: Asynchronous programming can be a powerful paradigm, whereby computations are allowed to run concurrently without blocking the main session. It is an opportune time to survey the current landscape, as R infrastructure in this respect has matured significantly over recent years. Instead of running a script sequentially from top to bottom, logic that takes a long or unpredictable amount of time to complete may be offloaded to different R processes, possibly on other computers or in the cloud. In the meantime, the main session may be running constantly and non-interactively, performing operations in real time, synchronizing with these tasks only when necessary. This style of programming requires a very specific set of tooling. At the very base, there is an infrastructure layer involving key enabling packages such as later and mirai. It will be explained at a high level why these two packages together currently offer the most complete and efficient implementation of async for the R language. There are further tools which expand async functionality to cover specific needs, such as the watcher package for filesystem monitoring. There are then a range of tools built on top of these, bringing async capabilities to the end-user, such as the httr2 package for querying APIs and the ellmer package for interacting with LLMs. In addition to these existing tools, exciting developments in asynchronous programming are just around the corner. These will be previewed, together with speculation on what might be possible at some point in the future.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Charlie Gao (Posit, PBC)
Keyword(s): asynchronous programming, distributed computing, parallel computing, open source tools
Video recording available after conference: ✅ |
Charlie Gao (Posit PBC) |
TBD |
Thinking Inside the {box}: A Structured Approach for Full-Stack App Development
As Shiny applications scale, maintaining clean structure, managing dependencies, and ensuring long-term maintainability become increasingly challenging. The {box} package modernizes R’s approach to modularization, while {rhino} provides a structured framework for building robust Shiny apps. Together, they offer a structured and scalable workflow for Shiny development. In this talk we will explore how to leverage {box}’s modularity for API development as well, using a structured approach to manage routers, endpoints, filters, and error handlers. This workflow takes advantage of the programmatic usage of {plumber} as an alternative to the annotation-based approach. To ground these concepts in a real-world scenario, the talk will present a case study of a Shiny application that integrates {box} for modular design and {plumber} for structured API development. We will walk through key architectural decisions, demonstrate how modularization improves maintainability, and explore how this approach streamlines both Shiny and API development, giving attendees actionable insights they can apply to their own projects.
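A minimal sketch of the pattern described, using plumber's programmatic API (pr(), pr_get()) instead of #* annotations, with box::use() for imports. The module path and function names are hypothetical, not taken from the talk's case study:

```r
# app/logic/api.R -- hypothetical module layout
box::use(plumber[pr, pr_get, pr_run])

#' @export
create_api <- function() {
  # Build a router programmatically: each endpoint is a plain R function,
  # so routes can be composed, tested, and reused like any other code.
  pr() |>
    pr_get("/health", function() list(status = "ok")) |>
    pr_get("/square", function(x) as.numeric(x)^2)
}

# Elsewhere in the app:
# box::use(app/logic/api)
# api$create_api() |> pr_run(port = 8000)
```

Because the router is an ordinary object returned by a function, error handlers and filters can be attached in the same pipeline, which is the structured alternative to scattering #* annotations across files.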
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Samuel Enrique Calderon Serrano (Appsilon, R Shiny developer)
Keyword(s): modules, shiny, box, api, production
Video recording available after conference: ✅ |
Samuel Enrique Calderon Serrano (Appsilon R Shiny developer) |
TBD |
Transforming Public Health Data Management: From Individual Use to Scalable Workflows with R
The Health Information and Statistics Office within the Ministry of Health of Buenos Aires, Argentina, faced key and unexpected challenges in its first year as an organization. Formed in 2019 as a small, 10-person team building slow-paced data products with self-imposed goals, such as dashboards, it was suddenly tasked with managing information and statistics workflows during a pandemic, for a city of 3 million inhabitants, serving hundreds of physicians and public-sector decision-makers with near-real-time information. Over the years, this interdisciplinary team has tripled in size and has played a key role in high-impact strategic data science projects. These include developing data science solutions for extracting information from free-text data, creating complex algorithms for processing data from the city’s Electronic Health Records, and implementing large-scale cost recovery initiatives in the healthcare system by cross-referencing massive datasets and generating more than 35,000 rendered documents per week shared with key agencies in the city. To fulfill these objectives, the team has built a robust infrastructure and a wide range of digital products, all within the R ecosystem. The talk will cover the strategies, tools, and lessons learned in building efficient and reproducible data workflows in the public sector under very limited resources, and we’ll explore how R has been fundamental in transitioning from individual analyses to scalable, automated workflows.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): María Cristina Nanton (University of Buenos Aires)
Keyword(s): public sector, data science, data mining, workflows, city management
Video recording available after conference: ✅ |
María Cristina Nanton (University of Buenos Aires) |
TBD |
pkgdocs: a modular R package site generator
pkgdocs is a new R package that generates package documentation as markdown from an R source package. Unlike other tools, pkgdocs does not focus on generating a static website directly, but rather on producing pages that can be included in a larger documentation site. A common pattern in big projects is to modularize development across several R packages; by generating only markdown rather than a finished static site, pkgdocs makes it easier to combine the documentation of multiple packages. pkgdocs was made to work well with Hugo and the Docsy theme, but the markdown output should also be usable with other markdown-based static site generators with minor changes.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Daan Seynaeve; Anne-Katrin Hess
Keyword(s): package, documentation, markdown, static site generator
Video recording available after conference: ✅ |
Daan Seynaeve |
TBD |
rosella: diagnosing and interpreting classification models
Understanding the behavior of complex machine learning models is a central challenge today. Explainable AI (XAI) methods were introduced to provide insights into model predictions; however, interpreting these explanations can be difficult without proper visualisation methods. To fill this gap we have built rosella, an R package offering an interactive Shiny app that visualises model behavior in the data space alongside XAI explanations. Designed for developers, educators, and students, rosella makes model decisions more accessible and interpretable.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Janith Wanniarachchi (Monash University); Dianne Cook (Monash University), Kate Saunders (Monash University), Patricia Menendez (University of Melbourne), Thiyanga Talagala (University of Sri Jayewardenepura)
Keyword(s): high dimensional data, explainable ai, interactive tools, machine learning
Video recording available after conference: ✅ |
Janith Wanniarachchi (Monash University) |
Virtual Lightning |
TBD |
Gen AI-Powered Shiny Dashboard for Financial Collections
Tracking collections performance at a granular level is crucial for financial institutions. Our Shiny-based Collection Dashboard, powered by Gen AI, transforms the way business teams interact with data. The dashboard monitors key metrics such as bounce rate, first-EMI bounce, and current resolution, with multi-level filtering by zone, state, region, and branch. To enhance usability, we introduced:
- Automated PPT generation: users can download a fully customized PowerPoint presentation for any combination of filters. An LLM enriches the charts with summaries and actionable items for the business, providing key takeaways.
- “Talk to Your Data” (Text2SQL): business teams can query the data in natural language (e.g., “Which zone had the highest bounce rate this month?”) and receive instant, downloadable reports.
By integrating Gen AI, we have significantly reduced business teams’ dependency on analytics for day-to-day data needs, empowering them with self-serve insights at scale.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Arnav Chauhan (Cholamandalam Investment and Finance Company Ltd.); Sreeram R (Cholamandalam Investment and Finance Company Ltd.)
Keyword(s): gen ai, r shiny, financial data, ppt generation, text2sql
Video recording available after conference: ❌ |
Arnav Chauhan (Cholamandalam Investment and Finance Company Ltd.) Sreeram R (Cholamandalam Investment and Finance Company Ltd) |
TBD |
Exploring Fun and Functional R Packages
Everyone can use a bit of fun to improve their coding experience. While R is widely used for statistical analysis, it also has a creative and playful side. In this session, we’ll explore around 20 fun packages. Attendees will learn how to use packages like memer to create memes, emojifont to insert emojis into plots, and wordcloud2 to generate interactive word clouds, among others. By the end of the session, attendees will walk away with fresh ideas for integrating these tools into their daily workflows, whether for personal enjoyment or to create more engaging, impactful data visualizations. Preferred format: lightning talk, open to talks or posters.
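For a taste, one of the mentioned packages can be tried in a couple of lines, using the demoFreq dataset that wordcloud2 ships with:

```r
library(wordcloud2)

# demoFreq is a demo data frame of words and frequencies bundled with the
# package; wordcloud2() turns it into an interactive htmlwidget.
wordcloud2(demoFreq)
```

In RStudio or Positron the widget renders in the Viewer pane; in a Quarto or R Markdown document it embeds directly in the output.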
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Joanna Chen (TikTok)
Keyword(s): r packages, r for fun
Video recording available after conference: ❌ |
Joanna Chen (TikTok) |
TBD |
Extending Shiny with React.js: Interactive Bubble Charts with nivo.bubblechart
The nivo.bubblechart package is an R interface to the nivo library (nivo.rocks), designed for creating interactive bubble charts in Shiny applications. Built on top of React.js and D3.js, nivo provides powerful and customizable visualizations that go beyond traditional R plotting libraries. This talk will demonstrate how nivo.bubblechart leverages the reactR package to seamlessly extend Shiny with React components, enabling highly interactive, dynamic, and responsive visualizations. The audience will gain insights into how reactR bridges the gap between R and JavaScript, allowing developers to integrate modern web technologies into their R applications. Through live examples and code snippets, this session will highlight the advantages of using React-powered widgets in Shiny and how they can enhance the user experience with interactive graphics. Whether you’re an R developer exploring JavaScript or a Shiny user looking to extend your UI capabilities, this talk will provide practical takeaways to level up your Shiny dashboards. GitHub URL: https://github.com/DataRacerEdu/nivo.bubblechart
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Anastasiia Kostiv
Keyword(s): react.js, shiny, d3.js
Video recording available after conference: ✅ |
Anastasiia Kostiv |
TBD |
GeoLink R package
GeoLink is an R package that assists users with merging publicly available geospatial indicators with georeferenced survey data. The survey data can contain either latitude and longitude coordinates or an administrative identifier with a corresponding shapefile. The procedure involves downloading geospatial indicator data, shapefile tessellation, computing zonal statistics, and spatially joining the geospatial data with unit-level data. The package can, for example, be used to link household characteristics measured in surveys with satellite-derived measures such as the average radiance of night-time light. It can also calculate indicator values for each pixel covered by a tessellated grid in which a household is located. Finally, it can calculate zonal statistics for a user-defined shapefile (at native resolution or tessellated) and link the results to survey data. GeoLink complements the povmap and EMDI R packages to facilitate small area estimation with geospatial indicators; the latter two packages enable the estimation of regionally disaggregated indicators using small area estimation methods and include tools for processing, assessing, and presenting the results.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Christopher Lloyd (University of Southampton (WorldPop)); Luciano Perfetti-Villa (University of Southampton (School of Geography and Environmental Science))
Keyword(s): social statistics, geospatial indicator, administrative unit, household survey, poverty mapping
Video recording available after conference: ✅ |
Christopher Lloyd (University of Southampton (WorldPop)) |
TBD |
R-evealing Insights: Forecasting Demand and Visualizing Data for Optimal Dermatology Clinic Operations
Introduction: Efficiently managing patient demand and resources is crucial in dermatology, especially in India, where the doctor-to-patient ratio varies significantly, with a national average of approximately 1:834. This presentation explores the use of machine learning algorithms to forecast demand in dermatology clinics, together with an interactive visualization platform built with ggplot2 in R. The goal is to help clinics in India and beyond optimize operations, improve patient satisfaction, and enhance resource allocation.
Methods: We employ machine learning algorithms, including time series analysis and regression models, to analyze historical patient data. These algorithms identify trends and seasonal variations, enabling accurate demand forecasting. We also develop an interactive visualization platform using ggplot2 in R, which provides intuitive visualizations of clinic data, such as the busiest days, the main types of cases, and other critical metrics, and includes scenario-testing features to simulate various staffing and resource allocation strategies.
Results: The machine learning models successfully predict demand patterns, allowing clinics to anticipate busy periods and allocate resources effectively. The ggplot2-based visualization platform offers dynamic, customizable charts, making it easy for dermatologists to understand their clinic’s data. The scenario-testing feature lets clinics visualize the impact of different staffing and resource allocation strategies, facilitating data-driven decision-making.
Conclusion: Combining machine learning forecasts with interactive visualizations empowers dermatology clinics to enhance efficiency, improve patient care, and manage resources effectively. This holistic approach ensures that clinics are well prepared to meet patient demand, optimize operations, and deliver superior care.
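The time-series forecasting step described in the Methods might look like the following generic sketch, using the forecast package on simulated daily patient counts (the data and model choice are illustrative, not the authors' actual pipeline):

```r
library(forecast)

set.seed(1)
# Simulated daily patient counts with a weekly seasonal pattern
visits <- ts(rpois(180, lambda = 30 + 10 * sin(2 * pi * seq_len(180) / 7)),
             frequency = 7)

fit <- auto.arima(visits)      # select and fit a (seasonal) ARIMA model
fc  <- forecast(fit, h = 14)   # forecast demand for the next two weeks
autoplot(fc)                   # ggplot2-based plot of history + forecast
```

The fitted object plugs directly into ggplot2 via autoplot(), which is the natural bridge to the kind of interactive dashboard the abstract describes.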
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Anjali Ancy; Subitchan . (SAVEETHA INSTITUTE OF MEDICAL AND TECHNICAL SCIENCES), Monisha M (SAVEETHA INSTITUTE OF MEDICAL AND TECHNICAL SCIENCES)
Keyword(s): data visualisation, demand forecasting, ai in health, increasing patient involvement
Video recording available after conference: ✅ |
Anjali Ancy |
TBD |
Shiny AI Regression and Prediction: Integrating R Shiny with the gemini.R Package
Motivated by the limitations of existing statistical applications and the difficulty of constraining open-ended AI responses to user prompts, the Shiny AI Regression and Prediction (SHARP) application was developed to support statistical workflows, particularly linear regression modeling, with automatic result interpretation via AI-generated prompts. The otherwise open-ended AI responses are restricted with specific commands to ensure more controlled outputs. The application is built on two main packages, shiny and gemini.R, with several supporting packages (readxl, ggplot2, olsrr, and reshape2) used for data import, visualization, and modeling. The application is deployed on the shinyapps.io platform.
Links:
- Poster: https://drive.google.com/file/d/1LX85iqVOB1sKExLYLrdDf7UKgsEGiwDQ/view?usp=sharing
- Demonstration: https://drive.google.com/file/d/1gNrnEHW8--acgYRq0Ukl3UgVRhcHfNbM/view?usp=sharing
- Datasets: https://drive.google.com/drive/folders/1ClN-B8xKOc3y-AeT7KpsPUWDg9VkwFd9?usp=sharing
- Application: https://bqhcpg-joko0ade-nursiyono.shinyapps.io/Sharp/
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Joko Ade Nursiyono (BPS - Statistics of East Java, Indonesia)
Keyword(s): statistical application, data science, data mining, shiny, ai, data, deploy, application, automation, insight
Video recording available after conference: ✅ |
Joko Ade Nursiyono (BPS - Statistics of East Java Indonesia) |
TBD |
The wrong ways to run code in R
Some of R's features can be misused to do very confusing, if not outright misleading, things. We are going to explore a few of them, showing how they are used normally and how they are not intended to be used, in a style borrowing from Wat by Gary Bernhardt:
- Some S3 classes store functions inside the objects and call them from their own methods, akin to having a virtual method table in C++. By changing the stored function, we can make print(x) perform arbitrary actions.
- Source references are attributes that link executable objects to their source code. An invalid source reference will obscure the real source code of a function, making it look as if it does something different.
- Lazy evaluation and dynamic bindings can make variable access execute code.
- In addition to the normal evaluator that interprets the LANGSXP syntax trees, R contains a faster bytecode evaluator. If the bytecode and the normal body of a function disagree in important ways, the results can be very baffling.
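One of the listed tricks, making variable access execute code via dynamic bindings and lazy evaluation, can be reproduced in base R with makeActiveBinding() and delayedAssign() (a minimal illustration, not necessarily the speaker's exact examples):

```r
e <- new.env()

# Active binding: every read of `x` in e runs this function instead of
# fetching a stored value.
makeActiveBinding("x", function() {
  message("reading a variable just executed code")
  42
}, e)

e$x  # emits the message, then returns 42

# Lazy evaluation: the expression runs only when `y` is first accessed.
delayedAssign("y", { message("evaluated on first access"); "hello" })
y    # the message fires here, not at assignment time
```

Both mechanisms are legitimate (reactive values in Shiny use active bindings, for example), which is exactly what makes their misuse so confusing.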
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Ivan Krylov (Lomonosov Moscow State University)
Keyword(s): serialization, evaluation
Video recording available after conference: ✅ |
Ivan Krylov (Lomonosov Moscow State University) |
TBD |
quickr: Translate R to Fortran for Improved Performance
This talk introduces quickr, an R package designed to make numerical R code faster by translating R functions to Fortran. While R code offers great flexibility, that flexibility often comes at the expense of performance, especially for computationally intensive tasks. To achieve better speed, users typically need to rewrite performance-critical code in compiled languages like C or Fortran, which adds complexity and creates maintenance overhead. quickr simplifies this process: users add simple type declarations to their existing R functions, and quickr automatically translates the entire function into efficient Fortran routines. The presentation will demonstrate quickr in practical applications, with benchmarks showing performance improvements comparable to native C implementations. The talk will also cover current limitations, including supported data types and language features, and show how quickr can be easily integrated into existing R packages. Participants will learn how quickr can help improve the performance of their R code without significantly increasing development complexity or sacrificing readability.
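Based on the declare()/type() annotations named in the keywords, usage looks roughly like the sketch below; treat the exact signatures as an assumption rather than the package's definitive interface:

```r
library(quickr)

slow_sum <- function(x) {
  declare(type(x = double(NA)))   # annotate: x is a double vector, any length
  total <- 0
  for (xi in x) total <- total + xi
  total
}

# quick() translates the annotated function to Fortran and compiles it;
# the result is a drop-in replacement with the same semantics.
fast_sum <- quick(slow_sum)
fast_sum(as.double(1:10))         # same answer as slow_sum(), but compiled
```

The R source stays the single source of truth, which is the readability and maintenance point the abstract emphasizes.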
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Tomasz Kalinowski (Posit, PBC)
Keyword(s): speed, hpc, numerical computing, type annotation, r syntax, declare(), fortran
Video recording available after conference: ✅ |
Tomasz Kalinowski (Posit PBC) |