optimization - Subset data with dynamic conditions in R
I have a dataset of 2,500 rows of bank loans. Each bank loan has an outstanding amount and a collateral type (real estate, machine tools, etc.).
I need to draw a random selection out of this dataset where, for example, the sum of the outstanding amounts = 2.5 million +-5% and a maximum of 25% of the loans have the same asset class.
I found the function optim, but it asks for a function and looks to be constructed for optimizing a portfolio of stocks, which is much more complex. Is there an easy way of achieving this?
I created a sample data set to illustrate my question better:
dataset <- data.frame(
  balance = c(25000, 50000, 35000, 40000, 65000, 10000, 5000, 2000, 2500, 5000),
  collateral = c("real estate", "aeroplanes", "machine tools", "auto vehicles",
                 "real estate", "machine tools", "office equipment",
                 "machine tools", "real estate", "auto vehicles"))
Say I want, for example, 5 loans out of this dataset with a sum of outstanding balance = 200,000 (with a 10% margin) and no more than 40% allowed to have the same collateral type (so a maximum of 2 out of 5 in this example).
Please let me know if additional information is necessary. Many thanks, Tim
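To make the two conditions concrete, here is a minimal sketch of a check for one candidate selection of rows. The helper name `is_valid_selection` and its argument names are made up for illustration, and it treats the margin as inclusive, which is an assumption.

```r
# Sample data from the question.
dataset <- data.frame(
  balance = c(25000, 50000, 35000, 40000, 65000, 10000, 5000, 2000, 2500, 5000),
  collateral = c("real estate", "aeroplanes", "machine tools", "auto vehicles",
                 "real estate", "machine tools", "office equipment",
                 "machine tools", "real estate", "auto vehicles"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE if the rows in `idx` meet both conditions.
#   target    - desired total outstanding balance
#   tol       - allowed relative deviation from target (0.10 = 10%)
#   max_share - largest allowed fraction of loans per collateral type
is_valid_selection <- function(df, idx, target, tol, max_share) {
  picked <- df[idx, ]
  total_ok <- abs(sum(picked$balance) - target) <= tol * target
  share_ok <- all(table(picked$collateral) / length(idx) <= max_share)
  total_ok && share_ok
}

# Five different collateral types, total balance 195,000
is_valid_selection(dataset, c(3, 7, 4, 5, 2), 200000, 0.10, 0.4)  # TRUE
# Three "real estate" loans out of five (60%)
is_valid_selection(dataset, c(1, 5, 9, 2, 4), 200000, 0.10, 0.4)  # FALSE
```

Any random draw that passes this check would be an acceptable answer.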
This function I made works:
pick_records <- function(df, size, bal, collat, max.it) {
  # Draw random samples until one meets both conditions, at most max.it times
  for (j in seq_len(max.it)) {
    s_index <- sample(1:nrow(df), size)
    output <- df[s_index, ]
    total <- sum(output$balance)
    # Balance within 10% of bal, and no collateral type above the collat share
    if (total < bal * 1.1 && total > bal * 0.9 &&
        all(table(output$collateral) / size <= collat)) {
      return(output)
    }
  }
  print('no solution found')
  invisible(NULL)
}

> pick_records(dataset, 5, 200000, 0.4, 20)
  balance       collateral
3   35000    machine tools
7    5000 office equipment
4   40000    auto vehicles
5   65000      real estate
2   50000       aeroplanes
where df is your dataframe, size is the number of records you want, max.it is the maximum number of iterations to find a solution before returning a 'no solution found' error, bal is the limit for the balance, and collat is the same for the collateral. You can change these as you please.
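The 10% margin is hard-coded in the function, while the real data needs +-5%. A minimal sketch of one way to make the margin an argument follows; the name `pick_records_tol` and the `tol` argument are my own additions for illustration, not part of the function above.

```r
# Sample data from the question.
dataset <- data.frame(
  balance = c(25000, 50000, 35000, 40000, 65000, 10000, 5000, 2000, 2500, 5000),
  collateral = c("real estate", "aeroplanes", "machine tools", "auto vehicles",
                 "real estate", "machine tools", "office equipment",
                 "machine tools", "real estate", "auto vehicles"),
  stringsAsFactors = FALSE
)

# Hypothetical variant with the margin passed in as `tol` (0.10 = 10%).
pick_records_tol <- function(df, size, bal, tol, collat, max.it) {
  for (j in seq_len(max.it)) {
    output <- df[sample(nrow(df), size), ]
    total <- sum(output$balance)
    if (total > bal * (1 - tol) && total < bal * (1 + tol) &&
        all(table(output$collateral) / size <= collat)) {
      return(output)
    }
  }
  NULL  # no valid sample found within max.it draws
}

set.seed(1)  # only to make the draw repeatable
res <- pick_records_tol(dataset, 5, 200000, 0.10, 0.4, 1000)
res
```

Since this is plain trial and error, only a fraction of the random draws pass both conditions, so a generous max.it is probably safer than a small one. For the full data the call would presumably look like pick_records_tol(loans, n, 2500000, 0.05, 0.25, max.it), where loans and n stand in for your real data frame and sample size.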
Let me know if you don't understand part of it.