各个行业的许多公司都必须在某个时刻做出战略决策,决定在哪里建设支持其运营的设施。例如:
这些战略决策非常重要,一旦做出并实施了改变的成本也很高。此外,这些决策对客户满意度和成本管理都有重大影响。在这个过程中需要考虑的一个关键因素是公司计划服务的客户的位置。
客户分配问题与设施选址问题密切相关,该问题涉及设施(从一组候选地点)的最佳布置,以最小化公司设施与客户之间的距离。当设施有无限容量时,客户被假定由最近的设施提供服务。
如果认为客户数量太大,可以将客户分组到集群中。然后,可以使用集群中心来代替单独的客户位置。此预处理假定属于给定集群的所有客户将由分配给该集群的设施提供服务。这个任务可以使用k-means算法,它的目的是将$n$对象划分为$k$个同的和不重叠的集群。
设置限制: 设施数量不能超出设施数量的最大值 $$ \begin{equation} \sum_{j}\text{select}_j \leq \text{max_facilities} \end{equation} $$
设施必须是开启的: 一个开启的设施 $i$ 只能被匹配在一个已经顾客集群 $j$ 上
参考资料
%matplotlib inline
import random
import matplotlib.pyplot as plt
import numpy as np
from sklearn.cluster import MiniBatchKMeans
seed = 10101
num_customers = 50000 # 顾客数量
num_candidates = 50 # 备选设施地点数量
max_facilities = 8 # 最多的设施数量
num_clusters = 1000 # 集群数量
num_gaussians = 10 # 高斯值
threshold = 0.99 # 最远距离临界值
random.seed(seed)
customers_per_gaussian = np.random.multinomial(num_customers,
[1/num_gaussians]*num_gaussians)
customer_locs = []
for i in range(num_gaussians):
# 中心的备选值位于 [-0.5, 0.5]
center = (random.random()-0.5, random.random()-0.5)
customer_locs += [(random.gauss(0,.1) + center[0], random.gauss(0,.1) + center[1])
for i in range(customers_per_gaussian[i])]
# each candidate coordinate in [-0.5, 0.5]
facility_locs = [(random.random()-0.5, random.random()-0.5)
for i in range(num_candidates)]
print('First customer location:', customer_locs[0])
First customer location: (0.33164437091949245, -0.2809884943538464)
kmeans = MiniBatchKMeans(n_clusters=num_clusters, init_size=3*num_clusters,
random_state=seed).fit(customer_locs)
memberships = list(kmeans.labels_)
centroids = list(kmeans.cluster_centers_) # 每个集群的中心
weights = list(np.histogram(memberships, bins=num_clusters)[0]) # 每个集群顾客数量
print('First cluster center:', centroids[0])
print('Weights for first 10 clusters:', weights[:10])
First cluster center: [ 0.2876316 -0.22176323] Weights for first 10 clusters: [63, 48, 44, 50, 29, 32, 57, 51, 44, 53]
def dist(loc1, loc2):
return np.linalg.norm(loc1-loc2, ord=2) # 欧氏距离
pairings = {(cluster,facility):dist(centroids[cluster],facility_locs[facility])
for cluster in range(num_clusters)
for facility in range(num_candidates)
if dist(facility_locs[facility], centroids[cluster]) < threshold}
print("Number of viable pairings: {0}".format(len(pairings.keys())))
Number of viable pairings: 48348
import pulp as lp
m = lp.LpProblem(name="Facility_location",sense=lp.LpMinimize)
I = range(num_clusters) # 集群索引
J = range(num_candidates) # 备选设施索引
# 决策变量:设施选址
select = lp.LpVariable.dicts(name='select',indexs=J,cat='Binary')
assign = lp.LpVariable.dicts(name='assign',indexs=pairings.keys(),cat='Binary' )
# 目标函数
# 0. 总距离
obj = lp.lpSum(weights[c]
*pairings[c,f]
*assign[c,f]
for c,f in pairings.keys())
m+=obj,'Objective_function'
# 1. 设施限制
m += lp.lpSum(select[j] for j in J)<= max_facilities, "Facility_limit"
# 2. 设施开启
for c,f in pairings.keys():
m += assign[c,f] <= select[f]
# 3. 最近设施
for cluster in I:
m += lp.lpSum(assign[c,f] for c,f in pairings.keys() if c==cluster)==1
m.solve()
1
plt.figure(figsize=(8,8), dpi=150)
plt.scatter(*zip(*customer_locs), c='Pink', s=0.5)
plt.scatter(*zip(*centroids), c='Red', s=10)
plt.scatter(*zip(*facility_locs), c='Green', s=10)
assignments = [p for p in pairings if lp.value(assign[p]) > 0.5]
for p in assignments:
pts = [facility_locs[p[1]], centroids[p[0]]]
plt.plot(*zip(*pts), c='Black', linewidth=0.1)
lp.value(m.objective)
6707.723978486067