Virtual
A Robust and Informative Application for viewing the dataframes in R
More info: In R programming, the View() function from the utils package provides a basic interface for viewing a dataframe. This interface lacks features such as column selection, complex filtering, display of variable data types, variable metadata, code reproducibility, and download options. This presentation ([please click on this link to view my draft presentation][1]) will demonstrate a newly created, feature-rich application that includes all of the above features. The application was built using Shiny modules for viewing and examining dataframes from various statistical software such as SAS and Python. [1]: https://docs.google.com/presentation/d/1Mygbx-15iYyd8CVh7sgxtuhtteQ6hFZM/edit?usp=drivesdk&ouid=103296676447663833578&rtpof=true&sd=true
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Madhan Kumar Nagaraji
Keyword(s): statistical programming, clinical trials data, dataset interface, workflow
Video recording available after conference: ✅
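As context for what a module-based dataframe viewer involves, here is a minimal sketch of the Shiny module pattern the abstract refers to. The function names (`df_viewer_ui`, `df_viewer_server`) and the column-selection feature shown are illustrative assumptions, not the presenter's actual code.

```r
library(shiny)

# Hypothetical module UI: a column selector plus a table
df_viewer_ui <- function(id) {
  ns <- NS(id)
  tagList(
    selectInput(ns("cols"), "Columns", choices = NULL, multiple = TRUE),
    tableOutput(ns("table"))
  )
}

# Module server: restricts the data frame to the selected columns
df_viewer_server <- function(id, data) {
  moduleServer(id, function(input, output, session) {
    updateSelectInput(session, "cols",
                      choices = names(data), selected = names(data))
    output$table <- renderTable({
      req(input$cols)
      data[, input$cols, drop = FALSE]
    })
  })
}

# Usage:
# shinyApp(
#   fluidPage(df_viewer_ui("viewer")),
#   function(input, output, session) df_viewer_server("viewer", mtcars)
# )
```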
A first look at Positron
More info: Positron is a next generation data science IDE built by the creators of RStudio. It has been available for beta testing for a number of months, and R users may have wondered if they should try it or if it will be a good fit for them. This new IDE is an extensible tool built to facilitate exploratory data analysis, reproducible authoring, and publishing data artifacts, and it is an IDE that supports but is not built only for R. How should an R user think about Positron, compared to the other options out there? In this talk, learn about how and why Positron is designed the way it is, what will feel familiar or new coming from other IDEs such as RStudio, and when (or if) people who use R should consider giving it a try. You’ll hear about different choices when it comes to defaults and ways of working, such as how to think about your projects or folders and how to manage multiple versions of R. You will also learn about new functionality for R users and package developers that we have never had before, like new approaches for managing R package tests and the ability to customize an IDE using extensions. If you are curious about Positron and how it fits into the R ecosystem, you’ll come away from this talk with more details about its capabilities and more clarity about whether it may be a good choice for you.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Julia Silge (Posit, PBC)
Keyword(s): ide, workflow, tooling
Video recording available after conference: ✅
Analyzing Census Data in R: Techniques and Applications
More info: This talk provides an introduction to working with IPUMS Census American Community Survey (ACS) data in R, focusing on key techniques for data preparation, weighting, and sampling. Participants will gain a foundational understanding of how Census data is structured and learn how to apply statistical weights to create representative analyses. The talk also explores the role of Census data in artificial intelligence (AI) and machine learning (ML), highlighting its applications in model training, fairness assessments, and demographic insights. Finally, the talk addresses the critical use of Census data in anti-discrimination frameworks, demonstrating how demographic techniques can help evaluate bias and promote equitable AI/ML outcomes. Through practical exercises and case studies, participants will develop essential skills for integrating Census data into AI-driven analyses with an emphasis on equity.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Joanne Rodrigues
Keyword(s): demography, frameworks, census data, equity ml/ai, anti-discrimination in ml/ai
Video recording available after conference: ✅
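The weighting technique the abstract mentions can be illustrated in base R. The toy data and the weight column (in the spirit of the ACS person weight, PERWT) are invented for illustration:

```r
# Toy microdata: each row is a sampled person; `weight` is the number of
# people in the population that the row represents (like the ACS PERWT)
people <- data.frame(
  income = c(30000, 52000, 75000, 41000),
  weight = c(120, 80, 40, 160)
)

# Unweighted mean treats every respondent equally...
unweighted <- mean(people$income)

# ...while the weighted mean reflects the population the sample represents
weighted <- weighted.mean(people$income, people$weight)

# Weighted population total for income
total_income <- sum(people$income * people$weight)
```

In practice, ACS analyses often go further, using replicate weights for variance estimation via packages such as survey or srvyr.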
Automating workflows with webhooks and plumber in R
More info: Webhooks have opened up new possibilities for automating workflows, eliminating the need for manual intervention. In this talk, I will demonstrate how you can use plumber, an R package for building APIs, to create a webhook listener that triggers your workflows, such as updating dashboards, triggering machine learning retraining, and other potential use cases. In the presentation, I will cover how to process payloads (using GitHub webhooks as an example), how to verify authenticity using HMAC signatures, and how to implement logging for tracking script execution, debugging, and monitoring. This talk will benefit researchers, data scientists, and developers who want to make their R workflows responsive to certain triggers.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Clinton David
Keyword(s): automation, event-driven workflows, plumber api, github webhooks
Video recording available after conference: ✅
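A minimal sketch of the HMAC verification step described above, assuming the GitHub convention of a `X-Hub-Signature-256: sha256=<hex>` header over the raw request body. The `verify_signature` helper is hypothetical; the plumber wiring is shown in comments to keep the runnable part dependency-light:

```r
library(digest)   # for hmac()

# Verify a GitHub-style webhook signature: GitHub sends the header
# "X-Hub-Signature-256: sha256=<hex HMAC-SHA256 of the raw body>"
verify_signature <- function(body, signature_header, secret) {
  expected <- paste0(
    "sha256=",
    hmac(secret, body, algo = "sha256", serialize = FALSE)
  )
  identical(signature_header, expected)
}

# A plumber endpoint could then gate the workflow on the check, e.g.:
# api <- plumber::pr() |>
#   plumber::pr_post("/webhook", function(req, res) {
#     sig <- req$HTTP_X_HUB_SIGNATURE_256
#     if (!verify_signature(req$postBody, sig, Sys.getenv("WEBHOOK_SECRET"))) {
#       res$status <- 401
#       return(list(error = "bad signature"))
#     }
#     list(status = "accepted")  # trigger the downstream job here
#   })
```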
Beyond Guesswork: How Econometric Models (MMMs) Guide Genius Marketing Decisions
More info: In a world where marketing budgets are scrutinised, customer journeys are more fragmented than ever, and every digital channel feels as though it’s taking over the world - how do you really know where to invest? Traditional measurement methods often fall short, leaving marketers with more questions than answers: Is TV a dead channel? Is last-click attribution selling your brand-building efforts short? Are you overinvesting in one channel while under-utilising another? In this talk, we will take you beyond surface-level metrics and into the world of econometrics in R: the gold standard for understanding marketing effectiveness. By applying time-series regression techniques, we can move past vanity metrics and gut feelings to uncover the real impact of marketing spend. Attribution is one of the biggest debates in marketing measurement; this talk will therefore also explore whether first-click, last-click, or equal weightings really make sense - or whether a more nuanced approach is needed to reflect consumer behaviour. Next, we’ll explore how to quantify return on investment with rigour, determine the optimal allocation of budget across channels, visualise the relationships between channels, and understand diminishing returns to avoid wasted spend. All while breaking down key marketing acronyms, campaign types, and measurement approaches to ensure you leave with a clear understanding of how to apply these concepts in the real world. Finally, we will demonstrate our award-winning (IPA, 2024) case study, built in R with December19 Media Agency for Xero Accounting UK, to bring our time-series regression analyses to life. If you’re looking for a session that moves beyond generic reporting and 6-figure marketing agency prices – this talk is made for you!
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Abbie Brookes (Data Scientist @ Datacove), Jeremy Horne (Director @ Datacove)
Keyword(s): marketing, statistical modelling, econometrics, measurement, regression
Video recording available after conference: ✅
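The time-series regression idea above can be sketched in base R. The geometric adstock transform, the simulated spend data, and all coefficients below are illustrative assumptions, not the presenters' model:

```r
# Geometric adstock: carry over a fraction `decay` of last week's
# transformed spend, modelling the lingering effect of advertising
adstock <- function(spend, decay) {
  out <- numeric(length(spend))
  out[1] <- spend[1]
  for (t in seq_along(spend)[-1]) {
    out[t] <- spend[t] + decay * out[t - 1]
  }
  out
}

# Simulated weekly data: sales respond to adstocked TV and raw search spend
set.seed(1)
weeks  <- 104
tv     <- pmax(rnorm(weeks, 50, 20), 0)
search <- pmax(rnorm(weeks, 30, 10), 0)
sales  <- 200 + 1.5 * adstock(tv, 0.6) + 2.0 * search + rnorm(weeks, 0, 10)

# A bare-bones media-mix regression; real MMMs add seasonality, price,
# saturation curves, and other diminishing-returns transforms
mmm <- lm(sales ~ adstock(tv, 0.6) + search)
coef(mmm)
```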
CSV to Parquet: Managing data for multi-language analytics teams
More info: CSV is arguably the default data storage format for analytics teams, prized for its simplicity: when the data is small, it is easy to inspect a CSV file in a spreadsheet program. However, CSV files become very slow to read and write at larger data sizes. Enter Apache Parquet: "Apache Parquet is an open source, column-oriented data file format designed for efficient data storage and retrieval. It provides high performance compression and encoding schemes to handle complex data in bulk and is supported in many programming languages and analytics tools." (from the Apache Parquet documentation). The talk will give an overview of the Apache Parquet format and its advantages over CSV. I will also demonstrate reading and writing data in this format in R, show the interoperability with Apache Arrow, and demonstrate how this format makes life easier for polyglot teams that use R and Python. The session will end with key points to consider when deciding on a storage format. Participants will be introduced to Apache Parquet and Arrow and be better able to choose a storage format for their workflows. Broad agenda: 1. Overview of the Apache Parquet format; 2. Benchmarking against CSV; 3. Reading and writing data in R and Python; 4. Interoperability with Arrow and PyArrow; 5. Conclusions and takeaways.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Viswadutt Poduri
Keyword(s): data processing, parquet, analytics, big data, storage
Video recording available after conference: ✅
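The R side of the round trip described above can be sketched with the arrow package (the toy data frame is invented for illustration):

```r
library(arrow)

# Write a data frame to Parquet and read it back; column types survive
# the round trip (unlike CSV, where everything is re-parsed from text)
df <- data.frame(
  id    = 1:3,
  when  = as.Date("2025-08-01") + 0:2,
  value = c(1.5, 2.5, 3.5)
)

path <- tempfile(fileext = ".parquet")
write_parquet(df, path)
back <- read_parquet(path)

# On the Python side the same file opens with pyarrow/pandas, e.g.:
#   import pandas as pd
#   pd.read_parquet(path)
```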
Data Visualization for Exploratory Factor Analysis
More info: Exploratory factor analysis (EFA) is routinely used by researchers to reduce the dimensionality of data and to form meaningful factors. While there are good guidelines on how to report the results, data visualization tools are rarely used in understanding the results of EFA. Good data visualization, especially in a multivariate framework with ordinal data, makes it easier for people to interpret the results of the analysis. This presentation introduces and demonstrates different data visualization techniques that can be used to illustrate the results of EFA and improve its interpretation. Advantages and disadvantages of each of these techniques are discussed. As EFA is oftentimes used on categorical data from survey research, exploratory data visualizations for categorical variables are also presented, apart from visualizations of the factor results. Moreover, these graphical tools can be used for purposes other than EFA. Data visualization and analysis is performed in R using publicly available survey research data.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Nivedita Bhaktha (Indian Institute of Technology Kanpur)
Keyword(s): factor analysis, exploratory data analysis, dimension reduction, ordinal data, survey research
Video recording available after conference: ✅
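For readers new to EFA in R, a minimal base-R sketch (simulated data, not the presenter's example) showing how factor loadings can be reshaped into a long format that is convenient for a loadings heatmap or bar chart:

```r
# Exploratory factor analysis with base R's factanal()
set.seed(42)
n <- 300
f1 <- rnorm(n); f2 <- rnorm(n)

# Six observed variables loading on two latent factors
dat <- data.frame(
  v1 = f1 + rnorm(n, sd = 0.5), v2 = f1 + rnorm(n, sd = 0.5),
  v3 = f1 + rnorm(n, sd = 0.5), v4 = f2 + rnorm(n, sd = 0.5),
  v5 = f2 + rnorm(n, sd = 0.5), v6 = f2 + rnorm(n, sd = 0.5)
)

fit <- factanal(dat, factors = 2, rotation = "varimax")

# Long-format loadings, one row per (variable, factor) pair,
# ready to feed into a heatmap or sorted bar chart
loadings_df <- data.frame(
  variable = rep(rownames(fit$loadings), ncol(fit$loadings)),
  factor   = rep(colnames(fit$loadings), each = nrow(fit$loadings)),
  loading  = as.vector(fit$loadings)
)
```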
Don’t Write Code Your Users Haven’t Asked For
More info: The fact that your code works doesn’t mean it’s useful to your users. Ensuring that code works correctly with unit testing is well established in R, but validating that we write the correct code—aligned with user needs—remains a challenge. In this talk, we’ll explore how Behavior-Driven Development (BDD) helps us collaborate with stakeholders by translating their needs into automated tests that check whether our software satisfies them. You’ll walk away with an understanding of how to start practicing BDD in R with {cucumber}: a method of producing tests that describe what your software does and that stay easier to maintain as your software evolves.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Jakub Sobolewski
Keyword(s): testing, behavior-driven development, test-driven development, efficient programming, gherkin
Video recording available after conference: ✅
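For readers unfamiliar with BDD, specifications in this style are typically written in Gherkin, the plain-language format that tools like {cucumber} translate into automated tests. The feature below is an invented illustration, not from the talk:

```gherkin
Feature: Filtering a sales report
  Scenario: A user narrows the report to one region
    Given a sales report with regions "North" and "South"
    When the user filters the report to "North"
    Then only rows for "North" are shown
```

Each Given/When/Then line is then bound to an R step definition that exercises the actual code, so the specification doubles as an executable test.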
Experimenting with LLM Applications in R
More info: Large language models (LLMs) are surprisingly easy to use; at their core, they’re just an API call away. But how do you go from calling a model to actually building something useful? In this talk, I’ll share my experiences creating and deploying LLM-powered applications in R. I’ll walk through different approaches I’ve tried, from incorporating LLMs into Shiny apps and using them to improve my R code, to experimenting with different models and deployment options. Along the way, I’ll highlight what worked, what didn’t, and what I learned in the process. Whether you’re curious about integrating LLMs into your own projects or just want to see what’s possible, this session will offer a practical look at building with LLMs in R without the hype.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Nic Crane
Keyword(s): automation, llms, ai
Video recording available after conference: ✅
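To make "just an API call away" concrete, here is a sketch that builds (but does not send) a chat-completion request with httr2. The URL, model name, and body shape are illustrative assumptions in the style of OpenAI-compatible APIs, not any specific vendor's contract:

```r
library(httr2)

# Build a chat request; req_perform(req) would actually send it
req <- request("https://api.example.com/v1/chat/completions") |>
  req_auth_bearer_token(Sys.getenv("LLM_API_KEY", "dummy-key")) |>
  req_body_json(list(
    model = "example-model",
    messages = list(
      list(role = "user", content = "Summarise mtcars in one sentence.")
    )
  ))

# resp <- req_perform(req)          # send the request
# resp_body_json(resp)              # parse the model's reply
```

Higher-level packages such as ellmer wrap this pattern so you rarely construct requests by hand.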
From Data to Narrative: Interactive Storytelling with Shiny
More info: Data Storytelling transforms complex datasets into clear, engaging narratives, combining analysis, visualization, and storytelling to inspire action and facilitate decision-making. This session focuses on using Shiny to craft compelling stories through dynamic, interactive applications that turn raw data into impactful insights. Through live demonstrations, attendees will discover how Shiny (via Shinylive in Quarto) bridges the gap between data and storytelling, empowering developers to create interactive dashboards that communicate complex ideas with clarity and impact. The session will highlight practical examples and best practices for building stories that resonate with diverse audiences. By the end of the session, participants will not only understand how to use Shiny to build interactive dashboards but also how to leverage these tools to create meaningful, audience-focused narratives. Shinylive will be demonstrated as a key enabler of engaging, visually appealing, and interactive data storytelling.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Francisco Alfaro (USM)
Keyword(s): quarto, shiny, data storytelling, interactive dashboards, visualization
Video recording available after conference: ✅
Generating interesting high-dimensional data structures
More info: A high-dimensional dataset is one where each observation is described by many features, or dimensions. Such a dataset might contain various types of structures with complex geometric properties, such as nonlinear manifolds, clusters, or sparse distributions. We can generate data containing a variety of structures using mathematical functions and statistical distributions: sampling from a multivariate normal distribution generates data in an elliptical shape, a trigonometric function can generate a spiral, and a torus function can create a donut shape. High-dimensional data structures are useful for testing, validating, and improving algorithms used in dimensionality reduction, clustering, machine learning, and visualization. Their controlled complexity allows researchers to understand the challenges posed in data analysis and helps to develop robust analytical methods across diverse scientific fields like bioinformatics, machine learning, and forensic science. Functions to generate a large variety of structures in high dimensions are organized into the R package cardinalR, along with some already generated examples.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Piyadi Gamage Jayani Lakshika (Monash University, Australia); Dianne Cook (Monash University, Australia), Paul Harrison (Monash University, Australia), Michael Lydeamore (Monash University, Australia), Thiyanga S. Talagala (University of Sri Jayewardenepura, Sri Lanka)
Keyword(s): high-dimensional data structures, mathematical functions, statistical distributions, geometrics
Video recording available after conference: ✅
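The three shapes named in the abstract (ellipse, spiral, torus) can be generated in a few lines of base R. This is a generic illustration of the idea, not code from cardinalR:

```r
n <- 500

# An elliptical Gaussian cloud in 4-D (diagonal covariance for simplicity)
gauss4 <- sapply(c(1, 2, 0.5, 3), function(sd) rnorm(n, sd = sd))

# A 2-D spiral from trigonometric functions, embedded in 4-D with noise
th <- seq(0, 4 * pi, length.out = n)
spiral <- cbind(th * cos(th), th * sin(th),
                rnorm(n, sd = 0.1), rnorm(n, sd = 0.1))

# A torus (donut) in 3-D: R is the distance from the centre of the hole
# to the centre of the tube, r is the tube radius
u <- runif(n, 0, 2 * pi); v <- runif(n, 0, 2 * pi)
R <- 3; r <- 1
torus <- cbind((R + r * cos(v)) * cos(u),
               (R + r * cos(v)) * sin(u),
               r * sin(v))
```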
Health care data harmonization using Shiny, clinical experts, and RDBMS
More info: In support of a large, international, multi-site health care project that developed new pediatric sepsis criteria, we created a pipeline that allows clinical experts to harmonize medications, observations, events, and laboratory measurements from electronic medical record extracts. This pipeline was instrumental in allowing the review and use of 2.2 billion rows (175 GB) of source data. During the development of the sepsis criteria, we received multiple new data deliveries from each site, which required frequent review and re-harmonization of the provided source datasets. The harmonization pipeline consisted of multiple steps, including conflating multiple source row types into one harmonized type, performing source-specific unit mapping, and performing value transformations. In an iterative process, clinical experts would identify rows for mapping, data scientists would run the harmonization pipeline, and clinical experts would then review the mapped data using Shiny tools custom built for this project. Due to the project and dataset size, we leveraged a range of tools including Google BigQuery, R, and make. After harmonization, the cleaned dataset was approximately 1.7 billion rows (155 GB) in size. This large amount of data required special considerations to perform acceptably: to keep Shiny responsive, to keep the server hosting our Shiny apps from crashing, and to prevent client browser crashes, we had to limit the data being reviewed to at most a random sample of 50% of the larger data groupings.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Seth Russell (University of Colorado Anschutz Medical Campus)
Keyword(s): big data, shiny, healthcare, data harmonization, rdbms
Video recording available after conference: ✅
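As a toy illustration of one step of this kind of pipeline, the unit-mapping idea can be sketched in base R with a lookup table and a merge. The data, unit names, and conversion factor below are invented for illustration, not from the project:

```r
# Site lab extracts reporting the same test in different units
labs <- data.frame(
  site  = c("A", "A", "B"),
  test  = c("glucose", "glucose", "glucose"),
  value = c(5.5, 6.1, 100),
  unit  = c("mmol/L", "mmol/L", "mg/dL")
)

# Mapping table of conversion factors to the harmonized unit,
# of the kind clinical experts would curate and review
unit_map <- data.frame(
  unit        = c("mmol/L", "mg/dL"),
  target_unit = c("mg/dL", "mg/dL"),
  factor      = c(18.0182, 1)
)

# Join the mapping onto the data, then apply the value transformation
harmonized <- merge(labs, unit_map, by = "unit")
harmonized$value_harmonized <- harmonized$value * harmonized$factor
```

At the scale described in the abstract, the same join-and-transform logic would run inside the database (e.g. BigQuery) rather than in R's memory.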
Intracranial Pressure Monitor Placement Prediction in Children with Traumatic Brain Injury
More info: Traumatic brain injury causes approximately 2,200 deaths and 35,000 hospitalizations in U.S. children annually. Clinicians currently make decisions about placing an intracranial pressure (ICP) monitor in children with traumatic brain injury without the benefit of an accurate clinical decision support tool. In a prospective observational cohort study, we developed and validated models that predict placement of an ICP monitor. Patient data was gathered from multiple sources and discretized into 5-minute intervals. We divided data into four combinations of nurse-documented and chart-extracted input data, all including patient-level and vital-sign variables, and with inclusion or exclusion of data from brain computed tomography imaging reports and invasive blood pressure readings. Using R, we built machine learning models using logistic regression, support vector machines, generalized estimating equations, generalized additive models, and LSTMs. We trained each model with each combination of data. Optimal parameters were identified based on the highest F1 score. The best performing model, an LSTM deep learning model, achieved an F1 of 0.71 within 720 minutes of hospital arrival. The best non-neural-network model, standard logistic regression, achieved an F1 of 0.36 within 720 minutes of hospital arrival. While non-RNN models did not achieve the best F1, their coefficient sizes and directions provide insight into the factors predicting ICP monitor placement. Additionally, the generalized additive models allow for visualization and interpretation of the marginal impact of a variable over time (after integrating out the impact of the other variables).
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Seth Russell (University of Colorado Anschutz Medical Campus)
Keyword(s): deep learning, machine learning, healthcare, decision making
Video recording available after conference: ✅
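A minimal base-R sketch of the non-neural baseline: a logistic regression plus an F1 computation at a 0.5 threshold. The simulated variables and coefficients are invented stand-ins, not the study's data or model:

```r
set.seed(7)
n <- 1000
hr   <- rnorm(n, 100, 15)          # heart rate (invented distribution)
gcs  <- sample(3:15, n, TRUE)      # Glasgow Coma Scale score

# Simulate the outcome so it depends on both predictors
prob <- plogis(-1 + 0.02 * (hr - 100) - 0.3 * (gcs - 10))
icp_monitor <- rbinom(n, 1, prob)

# Logistic regression and class predictions at a 0.5 threshold
fit  <- glm(icp_monitor ~ hr + gcs, family = binomial)
pred <- as.integer(predict(fit, type = "response") > 0.5)

# F1 = harmonic mean of precision and recall
f1_score <- function(truth, pred) {
  tp <- sum(truth == 1 & pred == 1)
  precision <- tp / sum(pred == 1)
  recall    <- tp / sum(truth == 1)
  2 * precision * recall / (precision + recall)
}
f1 <- f1_score(icp_monitor, pred)
```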
Plot Twist: Adding Interactivity to the Elegance of ggplot2 with ggiraph
More info: One of the most common critiques of ggplot2 is its lack of built-in interactivity. While static plots are powerful for storytelling, interactive visualizations can enhance exploration, engagement, and accessibility. The ggiraph package finally provides a seamless way to add interactivity to ggplot2—enabling hover effects, tooltips, and clickable elements—while preserving the familiar layered approach and custom theming. In this talk, Tanya Shapiro and Cédric Scherer will demonstrate why ggiraph stands out among other solutions, such as plotly, and how it integrates effortlessly with ggplot2 and its extension ecosystem. We’ll walk through real-world examples, explore its key functionalities, and share practical tips for creating engaging and well-designed interactive visualizations with ggiraph. Whether you’re looking to make your research more engaging, enhance dashboards, or create interactive reports, this talk will provide a solid foundation for elevating your data storytelling with interactive visualizations.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Cédric Scherer (Independent Contractor), Tanya Shapiro (Independent Contractor)
Keyword(s): data visualization, ggplot2, interactive charts, storytelling, dashboard
Video recording available after conference: ✅
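The core ggiraph pattern is small: swap a geom for its `_interactive` counterpart, add `tooltip`/`data_id` aesthetics, and wrap the plot with `girafe()`. A minimal sketch on a built-in dataset (not from the talk):

```r
library(ggplot2)
library(ggiraph)

# geom_point_interactive() replaces geom_point(); everything else is
# ordinary ggplot2, including themes and scales
p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point_interactive(
    aes(tooltip = rownames(mtcars), data_id = rownames(mtcars)),
    size = 3
  ) +
  theme_minimal()

# girafe() turns the ggplot into an interactive htmlwidget usable in
# R Markdown/Quarto documents and Shiny apps
widget <- girafe(ggobj = p)
```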
RDepot - 100% open source enterprise management of R and Python repositories
More info: RDepot is a solution for the management of R and Python package repositories in an enterprise environment. It allows users to submit packages through a user interface or API and automatically updates and publishes R and Python repositories. Multiple departments can manage their own repositories, and different users can have different roles in the management of their packages. With continuous integration infrastructure for quality assurance on R and Python packages, package uploads can be automated. All configuration is declarative, and RDepot can be set up as infrastructure as code, which is especially relevant in regulated contexts since it makes validation activities much easier. Packages from publicly available R repositories such as CRAN and Bioconductor can be mirrored selectively in custom repositories for use behind a firewall, in internal networks, and offline. Combined with Crane, authentication and fine-grained authorization (using OpenID Connect) can be configured per repository, which offers extra security when dealing with sensitive data or sensitive methodology. In this talk we will walk R users and developers through the different features of RDepot and demonstrate how these can be useful in different scenarios. The logic of the different workflows will be explained, and live demos will show the open source solution in action. We will address needs ranging from small research groups sharing a handful of packages up to multinational companies managing their R (and Python) code across the globe.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Jonas Van Malder
Keyword(s): package management, infrastructure, open source
Video recording available after conference: ✅
Sharing data science artifacts across teams using Crane
More info: Do you have to share many data science artifacts across teams? This is a problem for many data science organizations and can now be solved using Crane (https://craneserver.net/), a novel open-source product. Crane hosts data science artifacts such as data analysis reports, documentation sites, or even packages and libraries, all kept under strict authentication and authorization using modern protocols (OIDC). In this talk, we walk you through the different features of Crane and provide a live demo to explain the concepts. We will discuss its configuration file and demonstrate that authentication in Crane is fully declarative and allows for fine-grained configuration (at the user, group, or network level, or using SpEL) while still using an intuitive hierarchical tree that corresponds to the directory structure of the data. Next, we will show how artifacts can be accessed from or uploaded into Crane using the Crane API from R (e.g. to automate report updates or use data science artifacts in CI/CD) or using its customizable UI. Further, we zoom in on audit logs to track operations on all files (e.g. for GxP purposes) and detail the different storage backends (S3 and local file system). To ensure Crane can perform in high-security settings, the code base has been tested using integration tests reaching code coverage of more than 70%. With this talk we want to teach any R user and developer the essentials of Crane and how it can be used to share their data science artifacts.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Lucianos Lionakis (Open Analytics)
Keyword(s): data sharing, data, automation, r, repository management
Video recording available after conference: ✅
Shiny Policies: Customised Dashboards to Aid British Government Decisions
More info: Shiny dashboards are a powerful tool for visualising and interacting with data, but without thoughtful design they can feel generic, clunky, or even inaccessible to key users. In this talk, we will explore how to take Shiny beyond its default appearance to create dashboards that are not only visually appealing but also highly usable, accessible, and seamlessly integrated into an organisation’s digital environment. To demonstrate this, we will share our open-source dashboard built in collaboration with the British Department for Environment, Food and Rural Affairs (DEFRA). While the project required thorough data integration and analysis, one of the biggest challenges was ensuring the dashboard was not just functional but also visually cohesive, highly accessible, and intuitive for a broad range of users, including policymakers with varying levels of data literacy. We’ll start by discussing how to balance over-simplifying and over-complicating data. As with most open-source data, there is a vast library of data with little documentation on how to interpret it; making the most of open-source government data therefore underpins this talk, alongside interactivity and efficient rendering techniques that keep dashboards responsive and user-friendly. Next, with the foundations in place, we will jump into customisation, looking at how custom CSS and JavaScript can be leveraged to break free from the typical Shiny aesthetic, ensuring dashboards align with existing brand guidelines and user expectations. From typography and colour schemes to interactive elements, we’ll discuss techniques to create a polished, professional design that feels like a natural extension of an organisation’s existing web presence. Accessibility is another key factor in dashboard design. Many users—whether government policymakers, corporate stakeholders, or public audiences—have varying levels of data literacy, and a poorly designed interface can create barriers to insight. We’ll cover strategies for making dashboards more intuitive, including thoughtful navigation structures, tooltips, dynamic summaries, and alternative ways to display data for users with different needs. Additionally, we’ll explore best practices for ensuring compliance with accessibility standards, such as improving contrast, enabling keyboard navigation, and implementing screen-reader-friendly elements. By the end of this session, you’ll have a clear understanding of how to design Shiny dashboards that are not just functional but genuinely enjoyable to use, helping your audience engage with data more effectively and make better-informed decisions with open-source data.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Abbie Brookes, Jeremy Horne (Director @ Datacove)
Keyword(s): shiny apps, dashboard, environmental science, health science, decision-making, customisation
Video recording available after conference: ✅
ShinyProxy: easily deploy your Shiny apps
More info: ShinyProxy (https://shinyproxy.io/) is a 100% open source framework to deploy Shiny (and other) apps or web-based IDEs (like RStudio). Because of its flexibility, ShinyProxy is used by both small startups and large enterprises. Although ShinyProxy was originally tailored towards hosting Shiny apps, it can host virtually any web app. Since ShinyProxy makes it easy to build reproducible apps, even when using multiple R versions, it’s often used by pharmaceutical companies; nevertheless, it’s used by financial and engineering companies as well. ShinyProxy seamlessly integrates with your existing infrastructure (such as authentication providers and databases). The purpose of this talk is to give an introduction to ShinyProxy and to explain its use cases and its unique advantages over other solutions. No deep technical knowledge (e.g. of Docker or Linux) is needed to follow this talk; however, the talk will give you enough information to start using ShinyProxy yourself. As usual, the development of ShinyProxy has continued, so we’ll also give a preview of upcoming features.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Tobia De Koninck (Open Analytics NV)
Keyword(s): shiny, automation, docker, webapp
Video recording available after conference: ✅
System Design for Shiny Developers: The Comprehensive Deployment Architecture
More info: The presentation discusses aspects of application development and deployment that span beyond the Shiny application itself: data storage and access, user management and authentication, observability and telemetry, multi-lingual microservices for complex task delegation, caching, and more. All of these infrastructural elements can be created with free and open-source software such as the Docker engine, PostgreSQL, OpenLDAP, ShinyProxy, the R language, and various R packages. The entire system of services communicating with each other is orchestrated by Docker Compose and can be mapped on a single diagram, which is presented during the talk to provide a clear, high-level overview of the system design. The author will also present practical examples, guidelines, and tips on how to design and ship a complete solution from scratch. After the talk, the audience can expect virtual handout materials provided through a GitHub repository, which can be used as a starting template for their own projects.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Pavel Demin (Appsilon)
Keyword(s): shiny, shinyproxy, system design, microservices, docker
Video recording available after conference: ✅
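To make the Docker Compose orchestration concrete, here is an illustrative compose-file fragment wiring a database to ShinyProxy. The service names, images, file paths, and environment variables are generic assumptions, not the presenter's actual configuration:

```yaml
# Illustrative docker-compose sketch, not a prescribed setup
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
    volumes:
      - pgdata:/var/lib/postgresql/data
  shinyproxy:
    image: openanalytics/shinyproxy:latest
    ports:
      - "8080:8080"
    volumes:
      # ShinyProxy's own configuration, plus access to the Docker socket
      # so it can launch app containers
      - ./application.yml:/opt/shinyproxy/application.yml
      - /var/run/docker.sock:/var/run/docker.sock
    depends_on:
      - db
volumes:
  pgdata:
```

Additional services (an LDAP server, a reverse proxy, telemetry collectors) slot in as further entries under `services:`, which is what makes the single-diagram overview possible.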
The Future of Asynchronous Programming in R
More info: Asynchronous programming can be a powerful paradigm, whereby computations are allowed to run concurrently without blocking the main session. It is an opportune time to survey the current landscape, as R infrastructure in this respect has matured significantly over recent years. Instead of running a script sequentially from top to bottom, logic that takes a long or unpredictable amount of time to complete may be offloaded to different R processes, possibly on other computers or in the cloud. In the meantime, the main session may be running constantly and non-interactively, performing operations in real time, synchronizing with these tasks only when necessary. This style of programming requires a very specific set of tooling. At the very base, there is an infrastructure layer involving key enabling packages such as later and mirai. It will be explained at a high level why these two packages together currently offer the most complete and efficient implementation of async for the R language. There are further tools which expand async functionality to cover specific needs, such as the watcher package for filesystem monitoring. There are then a range of tools built on top of these, bringing async capabilities to the end-user, such as the httr2 package for querying APIs and the ellmer package for interacting with LLMs. In addition to these existing tools, exciting developments in asynchronous programming are just around the corner. These will be previewed, together with speculation on what might be possible at some point in the future.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Charlie Gao (Posit, PBC)
Keyword(s): asynchronous programming, distributed computing, parallel computing, open source tools
Video recording available after conference: ✅ |
Charlie Gao (Posit PBC) |
TBD |
Thinking Inside the {box}: A Structured Approach for Full-Stack App Development
As Shiny applications scale, maintaining clean structure, managing dependencies, and ensuring long-term maintainability become increasingly challenging. The {box} package modernizes R’s approach to modularization, while {rhino} provides a structured framework for building robust Shiny apps. Together, they offer a structured and scalable workflow for Shiny development. In this talk we will explore how to leverage {box}’s modularity for API development as well, using a structured approach to manage routers, endpoints, filters, and error handlers. This workflow takes advantage of the programmatic usage of {plumber} as an alternative to the annotation-based approach. To ground these concepts in a real-world scenario, the talk will present a case study of a Shiny application that integrates {box} for modular design and {plumber} for structured API development. We will walk through key architectural decisions, demonstrate how modularization improves maintainability, and explore how this approach streamlines both Shiny and API development, giving attendees actionable insights they can apply to their own projects.
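A minimal sketch of the pattern described, using plumber's programmatic API (pr(), pr_get()) instead of #* annotations, with box::use() for imports. The module path and function names are hypothetical, not taken from the talk's case study:

```r
# app/logic/api.R -- hypothetical module layout
box::use(plumber[pr, pr_get, pr_run])

#' @export
create_api <- function() {
  # Build a router programmatically: each endpoint is a plain R function,
  # so routes can be composed, tested, and reused like any other code.
  pr() |>
    pr_get("/health", function() list(status = "ok")) |>
    pr_get("/square", function(x) as.numeric(x)^2)
}

# Elsewhere in the app:
# box::use(app/logic/api)
# api$create_api() |> pr_run(port = 8000)
```

Because the router is an ordinary object returned by a function, error handlers and filters can be attached in the same pipeline, which is the structured alternative to scattering #* annotations across files.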
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Samuel Enrique Calderon Serrano (Appsilon, R Shiny developer)
Keyword(s): modules, shiny, box, api, production
Video recording available after conference: ✅ |
Samuel Enrique Calderon Serrano (Appsilon R Shiny developer) |
TBD |
Transforming Public Health Data Management: From Individual Use to Scalable Workflows with R
The Health Information and Statistics Office within the Ministry of Health of Buenos Aires, Argentina, faced key and unexpected challenges in its first year as an organization. Formed in 2019 as a small, 10-person team building slow-paced data products with self-imposed goals, such as dashboards, it was suddenly tasked with managing information and statistics workflows during a pandemic, for a city of 3 million inhabitants, serving hundreds of physicians and public-sector decision-makers with near-real-time information. Over the years, this interdisciplinary team has tripled in size and has played a key role in high-impact strategic data science projects. These include developing data science solutions for extracting information from free-text data, creating complex algorithms for processing data from the city’s Electronic Health Records, and implementing large-scale cost recovery initiatives in the healthcare system by cross-referencing massive datasets and generating more than 35,000 rendered documents per week shared with key agencies in the city. To fulfill these objectives, the team has built a robust infrastructure and a wide range of digital products, all within the R ecosystem. The talk will cover the strategies, tools, and lessons learned in building efficient and reproducible data workflows in the public sector under very limited resources, and we’ll explore how R has been fundamental in transitioning from individual analyses to scalable, automated workflows.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): María Cristina Nanton (University of Buenos Aires)
Keyword(s): public sector, data science, data mining, workflows, city management
Video recording available after conference: ✅ |
María Cristina Nanton (University of Buenos Aires) |
TBD |
pkgdocs: a modular R package site generator
pkgdocs is a new R package that generates package documentation as markdown from an R source package. Unlike other tools, pkgdocs does not focus on generating a static website directly, but rather on producing pages that can be included in a larger documentation site. A common pattern in big projects is to modularize development across several R packages; by generating only markdown rather than a finished static site, pkgdocs makes it easier to combine the documentation of multiple packages. pkgdocs was made to work well with Hugo and the Docsy theme, but the markdown output should also be usable with other markdown-based static site generators with minor changes.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Daan Seynaeve; Anne-Katrin Hess
Keyword(s): package, documentation, markdown, static site generator
Video recording available after conference: ✅ |
Daan Seynaeve |
TBD |
rosella: diagnosing and interpreting classification models
Understanding the behavior of complex machine learning models is a central challenge today. Explainable AI (XAI) methods were introduced to provide insights into model predictions; however, interpreting these explanations can be difficult without proper visualisation methods. To fill this gap we have built rosella, an R package offering an interactive Shiny app that visualises model behavior in the data space alongside XAI explanations. Designed for developers, educators, and students, rosella makes model decisions more accessible and interpretable.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Janith Wanniarachchi (Monash University); Dianne Cook (Monash University), Kate Saunders (Monash University), Patricia Menendez (University of Melbourne), Thiyanga Talagala (University of Sri Jayewardenepura)
Keyword(s): high dimensional data, explainable ai, interactive tools, machine learning
Video recording available after conference: ✅ |
Janith Wanniarachchi (Monash University) |
Virtual Lightning |
TBD |
Gen AI-Powered Shiny Dashboard for Financial Collections
Tracking collections performance at a granular level is crucial for financial institutions. Our Shiny-based Collection Dashboard, powered by Gen AI, transforms the way business teams interact with data. The dashboard monitors key metrics such as bounce rate, first-EMI bounce, and current resolution, with multi-level filtering by zone, state, region, and branch. To enhance usability, we introduced:
- Automated PPT generation: users can download a fully customized PowerPoint presentation for any combination of filters. An LLM enriches the charts with summaries and actionable items for the business, providing key takeaways.
- “Talk to Your Data” (Text2SQL): business teams can query the data in natural language (e.g., “Which zone had the highest bounce rate this month?”) and receive instant, downloadable reports.
By integrating Gen AI, we have significantly reduced business teams’ dependency on analytics for day-to-day data needs, empowering them with self-serve insights at scale.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Arnav Chauhan (Cholamandalam Investment and Finance Company Ltd.); Sreeram R (Cholamandalam Investment and Finance Company Ltd.)
Keyword(s): gen ai, r shiny, financial data, ppt generation, text2sql
Video recording available after conference: ❌ |
Arnav Chauhan (Cholamandalam Investment and Finance Company Ltd.) Sreeram R (Cholamandalam Investment and Finance Company Ltd) |
TBD |
Exploring Fun and Functional R Packages
Everyone can use a bit of fun to improve their coding experience. While R is widely used for statistical analysis, it also has a creative and playful side. In this session, we’ll explore around 20 fun packages. Attendees will learn how to use packages like memer to create memes, emojifont to insert emojis into plots, and wordcloud2 to generate interactive word clouds, among others. By the end of the session, attendees will walk away with fresh ideas for integrating these tools into their daily workflows, whether for personal enjoyment or to create more engaging, impactful data visualizations. Preferred format: lightning talk, open to talks or posters.
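For a taste, one of the mentioned packages can be tried in a couple of lines, using the demoFreq dataset that wordcloud2 ships with:

```r
library(wordcloud2)

# demoFreq is a demo data frame of words and frequencies bundled with the
# package; wordcloud2() turns it into an interactive htmlwidget.
wordcloud2(demoFreq)
```

In RStudio or Positron the widget renders in the Viewer pane; in a Quarto or R Markdown document it embeds directly in the output.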
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Joanna Chen (TikTok)
Keyword(s): r packages, r for fun
Video recording available after conference: ❌ |
Joanna Chen (TikTok) |
TBD |
Extending Shiny with React.js: Interactive Bubble Charts with nivo.bubblechart
The nivo.bubblechart package is an R interface to the nivo library (nivo.rocks), designed for creating interactive bubble charts in Shiny applications. Built on top of React.js and D3.js, nivo provides powerful and customizable visualizations that go beyond traditional R plotting libraries. This talk will demonstrate how nivo.bubblechart leverages the reactR package to seamlessly extend Shiny with React components, enabling highly interactive, dynamic, and responsive visualizations. The audience will gain insights into how reactR bridges the gap between R and JavaScript, allowing developers to integrate modern web technologies into their R applications. Through live examples and code snippets, this session will highlight the advantages of using React-powered widgets in Shiny and how they can enhance the user experience with interactive graphics. Whether you’re an R developer exploring JavaScript or a Shiny user looking to extend your UI capabilities, this talk will provide practical takeaways to level up your Shiny dashboards. GitHub URL: https://github.com/DataRacerEdu/nivo.bubblechart
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Anastasiia Kostiv
Keyword(s): react.js, shiny, d3.js
Video recording available after conference: ✅ |
Anastasiia Kostiv |
TBD |
GeoLink R package
GeoLink is an R package that assists users with merging publicly available geospatial indicators with georeferenced survey data. The survey data can contain either latitude and longitude coordinates or an administrative identifier with a corresponding shapefile. The procedure involves downloading geospatial indicator data, shapefile tessellation, computing zonal statistics, and spatially joining the geospatial data with unit-level data. The package can, for example, be used to link household characteristics measured in surveys with satellite-derived measures such as the average radiance of night-time light. It can also calculate indicator values for each pixel covered by a tessellated grid in which a household is located. Finally, it can calculate zonal statistics for a user-defined shapefile (at native resolution or tessellated) and link the results to survey data. GeoLink complements the povmap and EMDI R packages to facilitate small area estimation with geospatial indicators; the latter two packages enable the estimation of regionally disaggregated indicators using small area estimation methods and include tools for processing, assessing, and presenting the results.
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Christopher Lloyd (University of Southampton (WorldPop)); Luciano Perfetti-Villa (University of Southampton (School of Geography and Environmental Science))
Keyword(s): social statistics, geospatial indicator, administrative unit, household survey, poverty mapping
Video recording available after conference: ✅ |
Christopher Lloyd (University of Southampton (WorldPop)) |
TBD |
R-evealing Insights: Forecasting Demand and Visualizing Data for Optimal Dermatology Clinic Operations
Introduction: Efficiently managing patient demand and resources is crucial in dermatology, especially in India, where the doctor-to-patient ratio varies significantly, with a national average of approximately 1:834. This presentation explores the use of machine learning algorithms to forecast demand in dermatology clinics, together with an interactive visualization platform built with ggplot2 in R. The goal is to help clinics in India and beyond optimize operations, improve patient satisfaction, and enhance resource allocation.
Methods: We employ machine learning algorithms, including time series analysis and regression models, to analyze historical patient data. These algorithms identify trends and seasonal variations, enabling accurate demand forecasting. We also develop an interactive visualization platform using ggplot2 in R, which provides intuitive visualizations of clinic data, such as the busiest days, the main types of cases, and other critical metrics, and includes scenario-testing features to simulate various staffing and resource allocation strategies.
Results: The machine learning models successfully predict demand patterns, allowing clinics to anticipate busy periods and allocate resources effectively. The ggplot2-based visualization platform offers dynamic, customizable charts, making it easy for dermatologists to understand their clinic’s data. The scenario-testing feature lets clinics visualize the impact of different staffing and resource allocation strategies, facilitating data-driven decision-making.
Conclusion: Combining machine learning forecasts with interactive visualizations empowers dermatology clinics to enhance efficiency, improve patient care, and manage resources effectively. This holistic approach ensures that clinics are well prepared to meet patient demand, optimize operations, and deliver superior care.
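The time-series forecasting step described in the Methods might look like the following generic sketch, using the forecast package on simulated daily patient counts (the data and model choice are illustrative, not the authors' actual pipeline):

```r
library(forecast)

set.seed(1)
# Simulated daily patient counts with a weekly seasonal pattern
visits <- ts(rpois(180, lambda = 30 + 10 * sin(2 * pi * seq_len(180) / 7)),
             frequency = 7)

fit <- auto.arima(visits)      # select and fit a (seasonal) ARIMA model
fc  <- forecast(fit, h = 14)   # forecast demand for the next two weeks
autoplot(fc)                   # ggplot2-based plot of history + forecast
```

The fitted object plugs directly into ggplot2 via autoplot(), which is the natural bridge to the kind of interactive dashboard the abstract describes.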
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Anjali Ancy; Subitchan . (SAVEETHA INSTITUTE OF MEDICAL AND TECHNICAL SCIENCES), Monisha M (SAVEETHA INSTITUTE OF MEDICAL AND TECHNICAL SCIENCES)
Keyword(s): data visualisation, demand forecasting, ai in health, increasing patient involvement
Video recording available after conference: ✅ |
Anjali Ancy |
TBD |
Shiny AI Regression and Prediction: Integrating R Shiny with the gemini.R Package
Motivated by the limitations of existing statistical applications and the difficulty of constraining open-ended AI responses to user prompts, the Shiny AI Regression and Prediction (SHARP) application was developed to support statistical workflows, particularly linear regression modeling, with automatic result interpretation via AI-generated prompts. The otherwise open-ended AI responses are restricted with specific commands to ensure more controlled outputs. The application is built on two main packages, shiny and gemini.R, with several supporting packages (readxl, ggplot2, olsrr, and reshape2) used for data import, visualization, and modeling. The application is deployed on the shinyapps.io platform.
Links:
- Poster: https://drive.google.com/file/d/1LX85iqVOB1sKExLYLrdDf7UKgsEGiwDQ/view?usp=sharing
- Demonstration: https://drive.google.com/file/d/1gNrnEHW8--acgYRq0Ukl3UgVRhcHfNbM/view?usp=sharing
- Datasets: https://drive.google.com/drive/folders/1ClN-B8xKOc3y-AeT7KpsPUWDg9VkwFd9?usp=sharing
- Application: https://bqhcpg-joko0ade-nursiyono.shinyapps.io/Sharp/
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Joko Ade Nursiyono (BPS - Statistics of East Java, Indonesia)
Keyword(s): statistical application, data science, data mining, shiny, ai, data, deploy, application, automation, insight
Video recording available after conference: ✅ |
Joko Ade Nursiyono (BPS - Statistics of East Java Indonesia) |
TBD |
The wrong ways to run code in R
Some of R's features can be misused to do very confusing, if not outright misleading, things. We are going to explore a few of them, showing how they are used normally and how they are not intended to be used, in a style borrowing from Wat by Gary Bernhardt:
- Some S3 classes store functions inside the objects and call them from their own methods, akin to having a virtual method table in C++. By changing the stored function, we can make print(x) perform arbitrary actions.
- Source references are attributes that link executable objects to their source code. An invalid source reference will obscure the real source code of a function, making it look as if it does something different.
- Lazy evaluation and dynamic bindings can make variable access execute code.
- In addition to the normal evaluator that interprets the LANGSXP syntax trees, R contains a faster bytecode evaluator. If the bytecode and the normal body of a function disagree in important ways, the results can be very baffling.
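One of the listed tricks, making variable access execute code via dynamic bindings and lazy evaluation, can be reproduced in base R with makeActiveBinding() and delayedAssign() (a minimal illustration, not necessarily the speaker's exact examples):

```r
e <- new.env()

# Active binding: every read of `x` in e runs this function instead of
# fetching a stored value.
makeActiveBinding("x", function() {
  message("reading a variable just executed code")
  42
}, e)

e$x  # emits the message, then returns 42

# Lazy evaluation: the expression runs only when `y` is first accessed.
delayedAssign("y", { message("evaluated on first access"); "hello" })
y    # the message fires here, not at assignment time
```

Both mechanisms are legitimate (reactive values in Shiny use active bindings, for example), which is exactly what makes their misuse so confusing.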
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Ivan Krylov (Lomonosov Moscow State University)
Keyword(s): serialization, evaluation
Video recording available after conference: ✅ |
Ivan Krylov (Lomonosov Moscow State University) |
TBD |
quickr: Translate R to Fortran for Improved Performance
This talk introduces quickr, an R package designed to make numerical R code faster by translating R functions to Fortran. While R code offers great flexibility, that flexibility often comes at the expense of performance, especially for computationally intensive tasks. To achieve better speed, users typically need to rewrite performance-critical code in compiled languages like C or Fortran, which adds complexity and creates maintenance overhead. quickr simplifies this process: users add simple type declarations to their existing R functions, and quickr automatically translates the entire function into efficient Fortran routines. The presentation will demonstrate quickr in practical applications, with benchmarks showing performance improvements comparable to native C implementations. The talk will also cover current limitations, including supported data types and language features, and show how quickr can be easily integrated into existing R packages. Participants will learn how quickr can help improve the performance of their R code without significantly increasing development complexity or sacrificing readability.
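Based on the declare()/type() annotations named in the keywords, usage looks roughly like the sketch below; treat the exact signatures as an assumption rather than the package's definitive interface:

```r
library(quickr)

slow_sum <- function(x) {
  declare(type(x = double(NA)))   # annotate: x is a double vector, any length
  total <- 0
  for (xi in x) total <- total + xi
  total
}

# quick() translates the annotated function to Fortran and compiles it;
# the result is a drop-in replacement with the same semantics.
fast_sum <- quick(slow_sum)
fast_sum(as.double(1:10))         # same answer as slow_sum(), but compiled
```

The R source stays the single source of truth, which is the readability and maintenance point the abstract emphasizes.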
Date and time: Fri, Aug 1, 2025 - TBD
Author(s): Tomasz Kalinowski (Posit, PBC)
Keyword(s): speed, hpc, numerical computing, type annotation, r syntax, declare(), fortran
Video recording available after conference: ✅ |
Tomasz Kalinowski (Posit PBC) |