Continuous value supplied to discrete scale ggplot2
[Posted]: 2020-05-31 08:33:50
[Question]:
When I try to plot a decision boundary in R, I get the error "Continuous value supplied to discrete scale". I think the problem is in scale_colour_manual, but I don't know how to fix it. My code is below.
library(caTools)
set.seed(123)
split = sample.split(df$Purchased,SplitRatio = 0.75)
training_set = subset(df,split==TRUE)
test_set = subset(df,split==FALSE)
# Feature Scaling
training_set[,1:2] = scale(training_set[,1:2])
test_set[,1:2] = scale(test_set[,1:2])
# Fitting logistic regression to the training set
lr = glm(formula = Purchased ~ .,
         family = binomial,
         data = training_set)
#Predicting the test set results
prob_pred = predict(lr,type = 'response',newdata = test_set[-3])
y_pred = ifelse(prob_pred > 0.5, 1, 0)
#Making the Confusion Matrix
cm = table(test_set[,3],y_pred)
cm
#Visualizing the training set results
library(ggplot2)
set = training_set
X1 = seq(min(set[, 1]) - 1, max(set[, 1]) + 1, by = 0.01)
X2 = seq(min(set[, 2]) - 1, max(set[, 2]) + 1, by = 0.01)
grid_set = expand.grid(X1, X2)
colnames(grid_set) = c('Age', 'EstimatedSalary')
prob_set = predict(lr, type = 'response', newdata = grid_set)
y_grid = ifelse(prob_set > 0.5, 1,0)
ggplot(grid_set) +
  geom_tile(aes(x = Age, y = EstimatedSalary, fill = factor(y_grid)),
            show.legend = F) +
  geom_point(data = set, aes(x = Age, y = EstimatedSalary, color = Purchased),
             show.legend = F) +
  scale_fill_manual(values = c("orange", "springgreen3")) +
  scale_colour_manual(values = c("red3", "green4")) +
  scale_x_continuous(breaks = seq(floor(min(X1)), ceiling(max(X2)), by = 1)) +
  labs(title = "Logistic Regression (Training set)",
       ylab = "Estimated Salary", xlab = "Age")
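[Comments]:
It is easier to help you if you include a simple reproducible example with sample input that can be used to test and verify possible solutions. Remove any code that is not needed to reproduce the problem.
Your error is not coming from scale_color_manual but from scale_x_continuous. If I had to guess, I would say that your variable "Age" is not in numeric format, or that your X1 and X2 are not numeric either (but I would bet on the first option). As MrFlick requested, you should provide a reproducible example of your dataset.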
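One way to act on these comments is to check the class of each column before plotting and to print a small, copy-pasteable sample of the data. The sketch below is only an illustration: `df_demo` is a hypothetical stand-in built from random numbers, because the original `df` was never posted. Only the column names are taken from the question.

```r
# Hypothetical stand-in for the question's data; the real `df` was not posted.
set.seed(123)
df_demo <- data.frame(
  Age = sample(18:60, 20, replace = TRUE),
  EstimatedSalary = sample(seq(20000, 150000, by = 1000), 20, replace = TRUE),
  Purchased = sample(c(0, 1), 20, replace = TRUE)
)

# Class of every column. A numeric 0/1 `Purchased` is treated as a
# continuous variable by ggplot2, while a factor is treated as discrete.
col_classes <- sapply(df_demo, class)
print(col_classes)

# dput() prints R code that recreates the rows, which is convenient
# for pasting into a question as a reproducible example.
dput(head(df_demo, 5))
```

Running the same `sapply(..., class)` call on the real training set would show whether `Age`, `EstimatedSalary` and `Purchased` are numeric or factor columns.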
[Answer 1]:
Is your Purchased variable a factor? If not, it has to be. Try this:
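library(dplyr)
# Purchased is a column of the training set (`set`), not of grid_set,
# so the conversion to a factor is applied there.
set <- set %>%
  mutate(Purchased = factor(Purchased))
ggplot(grid_set) +
  geom_tile(aes(x = Age, y = EstimatedSalary, fill = factor(y_grid)),
            show.legend = F) + ... # add the rest of your commands.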
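The effect of the factor conversion can be seen in a small self-contained sketch. It assumes that `Purchased` is stored as numeric 0/1 (the question does not confirm this), and it uses a made-up data frame `toy` rather than the original data. `ggplot_build()` is called only to force the scales to be computed without opening a graphics device.

```r
library(ggplot2)

# Hypothetical toy data standing in for the question's training set.
set.seed(123)
toy <- data.frame(
  Age = rnorm(30),
  EstimatedSalary = rnorm(30),
  Purchased = rep(c(0, 1), 15)  # numeric 0/1 (an assumption about the real data)
)

# A numeric colour aesthetic combined with a manual (discrete) colour scale
# fails when the plot is built.
p_bad <- ggplot(toy, aes(x = Age, y = EstimatedSalary, colour = Purchased)) +
  geom_point() +
  scale_colour_manual(values = c("red3", "green4"))
err <- tryCatch({
  ggplot_build(p_bad)
  NULL
}, error = function(e) conditionMessage(e))
print(err)

# Wrapping the variable in factor() makes it discrete, so the manual scale applies.
p_ok <- ggplot(toy, aes(x = Age, y = EstimatedSalary, colour = factor(Purchased))) +
  geom_point(show.legend = FALSE) +
  scale_colour_manual(values = c("red3", "green4")) +
  labs(title = "Toy example", x = "Age", y = "Estimated Salary")
built <- ggplot_build(p_ok)
```

The exact wording of the error message differs between ggplot2 versions, so the sketch only captures it in `err` instead of matching a fixed string. Writing `colour = factor(Purchased)` inside `aes()` is an alternative to converting the column beforehand and leaves the data frame itself unchanged.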