Python Streamlit - filter a pandas dataframe without rerunning the entire script

Posted 2023-03-29

【Title】: Python Streamlit - filter pandas dataframe without rerun entire script 【Posted】: 2021-11-15 02:23:50 【Question】:

I have the following code:

import streamlit as st
import pandas as pd

#define data
d = {'id': ['a', 'b', 'c'], 'data': [3, 4, 6]}
df = pd.DataFrame(data=d)

#create sidebar input
with st.sidebar.form("my_form"):
    a = st.slider('sidebar for testing', 5, 10, 9)
    calculate = st.form_submit_button('Calculate') 
 

if calculate:
    df['result'] = df['data'] + a 
    st.write(df)
    # No issues up to this point: after I move the slider and press Calculate, the st.write(df) output stays on the web page

    ########debug############
    # I am trying to select an 'id' from the dropdown and use that to filter df, but when I select a value from the dropdown, 
    # the script runs again and the output disappears
    filter = st.selectbox('filter data', df['id'].unique())
    st.write(df[df['id'] == filter])

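I want to use the dropdown to filter the pandas dataframe down to the id I am interested in, but as soon as I use the dropdown, the code reruns.

Any idea how to solve this?

PS: I also tried wrapping the whole calculation in a function and adding the @st.cache decorator, but without success. I would appreciate it if someone could show me how it is done.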

【Comments】:

【Answer 1】:

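I was able to get this behavior by not using the submit button. Streamlit reruns the script from top to bottom whenever there is user input, so the form's submitted state is reset as well.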

import streamlit as st
import pandas as pd

d = {'id': ['a', 'b', 'c'], 'data': [3, 4, 6]}
df = pd.DataFrame(data=d)

a = st.slider('sidebar for testing', 5, 10, 9)

df['result'] = df['data'] + a
st.write(df)

# Now this will show the filtered row in the dataframe as you change the inputs
filter = st.selectbox('filter data', df['id'].unique())
st.write(df[df['id'] == filter])

For a more complex workflow I would refactor this and cache the loaded data, but for filtering your dataframe this should work.
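One possible shape for that refactor, as a rough sketch only: the computation is pulled out into small functions that take and return dataframes. The names `load_data`, `add_result` and `filter_by_id` are hypothetical and not from the answer. In a Streamlit app, `load_data` could additionally be wrapped with a caching decorator such as `st.cache_data` (available in recent Streamlit releases); that is left out here so the snippet runs with pandas alone.

```python
import pandas as pd


def load_data() -> pd.DataFrame:
    """Stand-in for an expensive load step (hypothetical helper)."""
    d = {'id': ['a', 'b', 'c'], 'data': [3, 4, 6]}
    return pd.DataFrame(data=d)


def add_result(df: pd.DataFrame, offset: int) -> pd.DataFrame:
    """Return a copy with a 'result' column, leaving the input untouched."""
    out = df.copy()
    out['result'] = out['data'] + offset
    return out


def filter_by_id(df: pd.DataFrame, selected: str) -> pd.DataFrame:
    """Keep only the rows whose 'id' equals the selected value."""
    return df[df['id'] == selected]


# In the app, offset would come from st.slider and selected from st.selectbox.
df = add_result(load_data(), offset=9)
row = filter_by_id(df, 'b')
```

Returning a copy from `add_result` keeps each step free of side effects, which makes the functions easier to test and to cache later.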

【Comments】:

【Answer 2】:

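Streamlit always reruns the code on every user interaction. However, you can work around this with st.session_state, which lets you share state between reruns. Its API is much like a standard Python dictionary.

Here is your example using st.session_state: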

import streamlit as st
import pandas as pd

#define data
d = {'id': ['a', 'b', 'c'], 'data': [3, 4, 6]}
df = pd.DataFrame(data=d)


#create sidebar input
with st.sidebar.form("my_form"):
    a = st.slider('sidebar for testing', 5, 10, 9)
    calculate = st.form_submit_button('Calculate')

# Initialization
if 'button_pressed' not in st.session_state:
    st.session_state['button_pressed'] = False

# Changes if calculated button is pressed  
if calculate:
    st.session_state['button_pressed'] = True

# Conditional on session_state instead
if st.session_state['button_pressed']:
    df['result'] = df['data'] + a
    st.write(df)
    # No issues up to this point: after I move the slider and press Calculate, the st.write(df) output stays on the web page

    ########debug############
    # I am trying to select an 'id' from the dropdown and use that to filter df, but when I select a value from the dropdown,
    # the script runs again and the output disappears
    filter = st.selectbox('filter data', df['id'].unique())
    st.write(df[df['id'] == filter])
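To see why the flag makes the output stick, the rerun model can be mimicked without Streamlit. This is a toy simulation under stated assumptions, not how Streamlit is implemented: a plain dict stands in for st.session_state, and the hypothetical function `run_script` plays the role of one top-to-bottom run of the script.

```python
def run_script(session_state: dict, calculate: bool, a: int) -> list:
    """One simulated top-to-bottom run; returns what would be displayed."""
    output = []
    # Initialization, only effective on the first run of the session
    if 'button_pressed' not in session_state:
        session_state['button_pressed'] = False
    # The button reads True only on the run triggered by the click
    if calculate:
        session_state['button_pressed'] = True
    # The stored flag, not the button itself, decides what is shown
    if session_state['button_pressed']:
        output.append(f'table with data + {a}')
    return output


state = {}  # stands in for st.session_state (assumption for this sketch)
first = run_script(state, calculate=False, a=9)   # initial page load
clicked = run_script(state, calculate=True, a=9)  # user presses Calculate
later = run_script(state, calculate=False, a=9)   # user changes the selectbox
```

On the third run the button reads False again, yet the stored flag is still True, so the table and the dropdown are rendered once more.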

【Comments】:
