school capacities / quotas #33

raphaellaude · 2023-03-23T15:39:19Z

A couple of suggestions (happy to work on a PR if you're open to it):

Not enough space: In overcrowded school districts, total capacity can be less than the total number of students. I think it would be worth building in rules for how "excess" students get allocated.
Eligibility: Some schools limit which students are eligible to attend (for example, based on living within a certain area or on test scores). It would be useful if the model still worked when a school's preferences include only a subset of all students.
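
To make the two suggestions concrete, here is a minimal sketch of student-proposing deferred acceptance that tolerates both cases. It is not the package's implementation: the function name `deferred_acceptance_sketch` and the dict-based inputs are assumptions made for illustration. Students who run out of schools are returned as `None` (the "excess" students), and a student missing from a school's ranking is treated as ineligible there.

```python
from collections import deque


def deferred_acceptance_sketch(student_prefs, school_ranks, quotas):
    """Student-proposing deferred acceptance with quotas and eligibility.

    student_prefs: {student: [schools, most preferred first]}
    school_ranks:  {school: {eligible student: rank}}, lower rank is better;
                   a student missing from a school's dict is ineligible there.
    quotas:        {school: number of seats}
    Returns {student: school or None}; None marks an unassigned student.
    """
    next_choice = {s: 0 for s in student_prefs}  # index of next school to try
    held = {school: [] for school in quotas}     # tentatively admitted students
    matches = {s: None for s in student_prefs}
    queue = deque(student_prefs)                 # students still proposing

    while queue:
        student = queue.popleft()
        prefs = student_prefs[student]
        if next_choice[student] >= len(prefs):
            continue  # preference list exhausted: student stays unassigned
        school = prefs[next_choice[student]]
        next_choice[student] += 1

        ranks = school_ranks.get(school, {})
        if student not in ranks or quotas.get(school, 0) == 0:
            queue.append(student)  # ineligible or no seats: try the next school
            continue

        held[school].append(student)
        # Stable sort: on tied ranks, earlier arrivals stay ahead of newcomers.
        held[school].sort(key=ranks.__getitem__)
        matches[student] = school
        if len(held[school]) > quotas[school]:
            rejected = held[school].pop()  # worst-ranked student loses the seat
            matches[rejected] = None
            queue.append(rejected)

    return matches
```

With four students, three one-seat schools, and one student ineligible for school C, the sketch leaves that student unassigned instead of raising. How excess students should ultimately be placed is a policy choice that this sketch leaves open.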

This is the error thrown when you try to run the model without enough capacity:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_40444\1300843094.py in <module>
     13 # Run the algorithm
     14 schools_quota = {"A": 1, "B": 1, "C": 1}
---> 15 matches = deferred_acceptance(
     16     students_df=students_df,
     17     schools_df=strict_school_df,

c:\webtools_dev\deferred_acceptance_school_choice\examples\..\deferred_acceptance\deferred_acceptance.py in deferred_acceptance(students_df, schools_df, schools_quota, verbose)
     42             if student not in unassigned_students:
     43                 school = available_school[student]
---> 44                 best_choice = students_df.loc[student][
     45                     students_df.loc[student].index.isin(school)
     46                 ].idxmin()

c:\Users\Raphael.WXYSTUDIO\Anaconda3\lib\site-packages\pandas\core\series.py in idxmin(self, axis, skipna, *args, **kwargs)
   2332         nan
   2333         """
-> 2334         i = self.argmin(axis, skipna, *args, **kwargs)
   2335         if i == -1:
   2336             return np.nan

c:\Users\Raphael.WXYSTUDIO\Anaconda3\lib\site-packages\pandas\core\base.py in argmin(self, axis, skipna, *args, **kwargs)
    717             # error: Incompatible return value type (got "Union[int, ndarray]", expected
    718             # "int")
--> 719             return nanops.nanargmin(  # type: ignore[return-value]
    720                 delegate, skipna=skipna
    721             )

c:\Users\Raphael.WXYSTUDIO\Anaconda3\lib\site-packages\pandas\core\nanops.py in _f(*args, **kwargs)
     91             try:
     92                 with np.errstate(invalid="ignore"):
---> 93                     return f(*args, **kwargs)
     94             except ValueError as e:
     95                 # we want to transform an object array

c:\Users\Raphael.WXYSTUDIO\Anaconda3\lib\site-packages\pandas\core\nanops.py in nanargmin(values, axis, skipna, mask)
   1140     values, mask, _, _, _ = _get_values(values, True, fill_value_typ="+inf", mask=mask)
   1141     # error: Need type annotation for 'result'
-> 1142     result = values.argmin(axis)  # type: ignore[var-annotated]
   1143     result = _maybe_arg_null_out(result, axis, mask, skipna)
   1144     return result

ValueError: attempt to get argmin of an empty sequence
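
The traceback ends in `idxmin()` being called on an empty selection, which is presumably what happens when a student has no school left to choose from. A minimal sketch of a guard follows; the helper name `best_available_school` and its arguments are hypothetical and not part of the package.

```python
import pandas as pd


def best_available_school(student_ranks, open_schools):
    """Return the student's top-ranked school among open_schools, or None.

    student_ranks: pd.Series indexed by school, lower value = more preferred.
    """
    candidates = student_ranks[student_ranks.index.isin(open_schools)]
    if candidates.empty:
        return None  # no open school left, so skip idxmin() entirely
    return candidates.idxmin()


ranks = pd.Series({"A": 1, "B": 2, "C": 3})
print(best_available_school(ranks, ["B", "C"]))  # B
print(best_available_school(ranks, []))          # None
```

Returning `None` lets the caller decide what to do with a student who cannot be placed, rather than failing inside pandas.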

Here's the error when you run the model with school preferences of variable length:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_40444\2589139436.py in <module>
      1 schools_preferences = {"A": [1, 1, 1], "B": [1, 3, 3, 1], "C": [2, 2, 2, 2]}
      2 
----> 3 students_df, schools_df = create_dataframes(
      4     students_list=students_list,
      5     students_preferences=students_preferences,

c:\webtools_dev\deferred_acceptance_school_choice\examples\..\deferred_acceptance\utils.py in create_dataframes(students_list, students_preferences, schools_list, schools_preferences)
     52     students_df.index = schools_list
     53     students_df = students_df.transpose()
---> 54     schools_df = pd.DataFrame(schools_preferences)
     55     schools_df.index = students_list
     56 

c:\Users\Raphael.WXYSTUDIO\Anaconda3\lib\site-packages\pandas\core\frame.py in __init__(self, data, index, columns, dtype, copy)
    634         elif isinstance(data, dict):
    635             # GH#38939 de facto copy defaults to False only in non-dict cases
--> 636             mgr = dict_to_mgr(data, index, columns, dtype=dtype, copy=copy, typ=manager)
    637         elif isinstance(data, ma.MaskedArray):
    638             import numpy.ma.mrecords as mrecords

c:\Users\Raphael.WXYSTUDIO\Anaconda3\lib\site-packages\pandas\core\internals\construction.py in dict_to_mgr(data, index, columns, dtype, typ, copy)
    500         # TODO: can we get rid of the dt64tz special case above?
    501 
--> 502     return arrays_to_mgr(arrays, columns, index, dtype=dtype, typ=typ, consolidate=copy)
    503 
    504 

c:\Users\Raphael.WXYSTUDIO\Anaconda3\lib\site-packages\pandas\core\internals\construction.py in arrays_to_mgr(arrays, columns, index, dtype, verify_integrity, typ, consolidate)
    118         # figure out the index, if necessary
    119         if index is None:
--> 120             index = _extract_index(arrays)
    121         else:
    122             index = ensure_index(index)

c:\Users\Raphael.WXYSTUDIO\Anaconda3\lib\site-packages\pandas\core\internals\construction.py in _extract_index(data)
    672             lengths = list(set(raw_lengths))
    673             if len(lengths) > 1:
--> 674                 raise ValueError("All arrays must be of the same length")
    675 
    676             if have_dicts:

ValueError: All arrays must be of the same length
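
`pd.DataFrame` refuses a dict of lists with unequal lengths, but it does accept a dict of `Series` and aligns them by label. The sketch below shows one possible way to represent partial eligibility on that basis. The student labels `s1` to `s4`, and the choice of which student is missing from school A, are assumptions; the thread only gives the rank values by position.

```python
import pandas as pd

# Hypothetical input: each school ranks only the students eligible for it.
schools_preferences = {
    "A": {"s1": 1, "s2": 1, "s3": 1},
    "B": {"s1": 1, "s2": 3, "s3": 3, "s4": 1},
    "C": {"s1": 2, "s2": 2, "s3": 2, "s4": 2},
}
students_list = ["s1", "s2", "s3", "s4"]

# A dict of Series is aligned on the student labels, so a school that ranks
# fewer students gets NaN for the rest instead of raising ValueError.
schools_df = pd.DataFrame(
    {school: pd.Series(ranks) for school, ranks in schools_preferences.items()}
).reindex(students_list)

print(schools_df)
```

A `NaN` entry can then be read as "not eligible", although the matching code would still need to handle those missing values explicitly.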


kyosek · 2023-03-23T22:09:49Z

Hi Raphael, thanks for the suggestions.
That's a good point - your suggestion regarding limited capacity makes sense. Maybe we can implement a rule that terminates the process once all schools are at full capacity. I'm happy for you to work on this and once it's done I can review it :)

wyattclarke · 2023-03-30T18:06:47Z

Super useful code. I ran into a similar problem applying it to more participants. My solution was to replace )[1:] on line 62 of deferred_acceptance.py with )[schools_quota[school]:].
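
Line 62 of deferred_acceptance.py is not shown in this thread, so the following is only an illustration of the idea behind that change, assuming the line slices a ranked list of applicants to find the ones a school rejects. The helper `split_by_quota` is hypothetical.

```python
def split_by_quota(applicants, ranks, quota):
    """Sort applicants by a school's ranking and split into (kept, rejected)."""
    ordered = sorted(applicants, key=lambda s: ranks[s])
    # [1:] would reject everyone but the single best applicant;
    # [quota:] rejects only those beyond the school's capacity.
    return ordered[:quota], ordered[quota:]


ranks = {"s1": 2, "s2": 1, "s3": 3}
print(split_by_quota(["s1", "s2", "s3"], ranks, quota=1))  # (['s2'], ['s1', 's3'])
print(split_by_quota(["s1", "s2", "s3"], ranks, quota=2))  # (['s2', 's1'], ['s3'])
```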
