I have a huge dictionary of lists that needs filtering. Here's an example of its output:
d = {
'hate': [(2310, "Experiencer: 'like hours'", 212, 222),
(2310, "Experiencer: 'two'", 1035, 1038),
(2310, "Experiencer: 'Anakin'", 1560, 1566),
(2310, "Experiencer: ' '", 1619, 1620),
(2310, "Experiencer: 'Tatooine'", 1726, 1734),
(2310, "Experiencer: 'Anakin'", 1775, 1781),
(2310, "Experiencer: 'Master Qui-Gon'", 1863, 1877),
(2310, "Experiencer: 'half'", 1883, 1887),
(2310, "Experiencer: 'One'", 2114, 2117),
(2310, "Experiencer: 'Anakin'", 2180, 2186),
(2310, "Stimulus: 'One'", 2484, 2487),
(2310, "Stimulus: 'Anakin'", 2564, 2570),
(2310, "Stimulus: 'Padme'", 2739, 2744)],
'confirmation': [(4132, "Experiencer: 'like hours'", 212, 222),
(4132, "Experiencer: 'two'", 1035, 1038),
(4132, "Experiencer: 'Anakin'", 1560, 1566),
(4132, "Experiencer: ' '", 1619, 1620),
(4132, "Experiencer: 'Tatooine'", 1726, 1734),
(4132, "Experiencer: 'Anakin'", 1775, 1781),
(4132, "Experiencer: 'Master Qui-Gon'", 1863, 1877),
(4132, "Experiencer: 'half'", 1883, 1887),
(4132, "Experiencer: 'One'", 2114, 2117),
(4132, "Experiencer: 'Anakin'", 2180, 2186),
(4132, "Experiencer: 'One'", 2484, 2487),
(4132, "Experiencer: 'Anakin'", 2564, 2570),
(4132, "Experiencer: 'Padme'", 2739, 2744),
(4132, "Experiencer: 'Anakin'", 2782, 2788),
(4132, "Experiencer: ' '", 2818, 2819),
(4132, "Experiencer: 'centuries'", 3562, 3571),
(4132, "Experiencer: 'one'", 3585, 3588),
(4132, "Experiencer: 'Anakin'", 3679, 3685),
(4132, "Experiencer: 'Anakin Skywalker'", 3789, 3805),
(4132, "Experiencer: 'Obi-Wan'", 4014, 4021),
(4132, "Experiencer: 'Qui-Gon'", 4025, 4032),
(4132, "Experiencer: 'Qui-Gon's'", 4100, 4109),
(4132, "Stimulus: 'Anakin'", 4281, 4287),
(4132, "Stimulus: ' '", 4355, 4356),
(4132, "Stimulus: 'Anakin'", 4436, 4442)]}
Each key (one of them is hate, as shown above) has a number at the beginning of every list element. Here, it's 2310.
I would like to be able to print out the two elements of the list whose third value is closest to that number: the next smallest one and the next biggest one.
Example output:
'hate': [(2310, "Experiencer: 'Anakin'", 2180, 2186),
(2310, "Stimulus: 'One'", 2484, 2487)]
because (2310, "Experiencer: 'Anakin'", 2180, 2186) has the number 2180, which is the next smallest one when compared to 2310, and in turn (2310, "Stimulus: 'One'", 2484, 2487) has the number 2484, which is the next biggest one when compared to 2310.
I guess this needs a for loop? How do I iterate over the dictionary of lists, compare the first, repeated number with the other numbers of every line, and return the closest ones, as mentioned above?
I hope my question is understandable enough...
Thanks in advance!
EDIT:
Goal would be to automate the process of going through the dictionary, and update it by filtering it.
The desired output of that dictionary would be something like this:
d = {
'hate': [(2310, "Experiencer: 'Anakin'", 2180, 2186),
(2310, "Stimulus: 'One'", 2484, 2487)],
'confirmation': [(4132, "Experiencer: 'Qui-Gon's'", 4100, 4109),
(4132, "Stimulus: 'Anakin'", 4281, 4287)],
...}
I also edited the above example of the output that I'm getting so far. It's a dictionary of lists.
If your lists are already sorted, we can use bisect to find the boundary between the "Experiencer" and "Stimulus" entries:
from bisect import bisect
l=[(2310, "Experiencer: 'like hours'", 212, 222), (2310, "Experiencer: 'two'", 1035,1038), (2310, "Experiencer: 'Anakin'", 1560, 1566), (2310, "Experiencer: ' '", 1619, 1620), (2310, "Experiencer: 'Tatooine'", 1726, 1734), (2310, "Experiencer: 'Anakin'", 1775, 1781), (2310, "Experiencer: 'Master Qui-Gon'", 1863, 1877), (2310, "Experiencer: 'half'", 1883, 1887), (2310, "Experiencer: 'One'", 2114, 2117), (2310, "Experiencer: 'Anakin'", 2180, 2186), (2310, "Stimulus: 'One'", 2484, 2487), (2310, "Stimulus: 'Anakin'", 2564, 2570), (2310, "Stimulus: 'Padme'", 2739, 2744)]
right_index = bisect(l, (2310, "F")) # "F" sorts between "Experiencer" and "Stimulus"
lower, higher = l[right_index-1], l[right_index]
print(lower, higher, sep="\n")
# (2310, "Experiencer: 'Anakin'", 2180, 2186)
# (2310, "Stimulus: 'One'", 2484, 2487)
Then you can process your dictionary quite easily:
from bisect import bisect
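def get_boundary(l):
    # Lists with fewer than 2 items are returned unchanged.
    # Longer lists are assumed to hold at least one "Experiencer"
    # and one "Stimulus" entry, in sorted order.
    if len(l) < 2:
        return l
    right_index = bisect(l, (l[0][0], "F"))
    return [l[right_index-1], l[right_index]]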
print({key: get_boundary(value) for key, value in d.items()})
produces
{'hate': [(2310, "Experiencer: 'Anakin'", 2180, 2186),
(2310, "Stimulus: 'One'", 2484, 2487)],
'confirmation': [(4132, "Experiencer: 'Qui-Gon's'", 4100, 4109),
(4132, "Stimulus: 'Anakin'", 4281, 4287)]
}
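The "F" trick depends on the labels: it finds the Experiencer/Stimulus boundary, which only coincides with the numeric boundary when every "Experiencer" entry sits below the target and every "Stimulus" entry above it. A minimal sketch that bisects on the numbers instead, assuming each list is sorted by its third value (the function name neighbours_by_start is made up for this example):

```python
from bisect import bisect_left

def neighbours_by_start(entries):
    """Return the entries just below and just above the target number.

    Assumes every tuple carries the same target at index 0 and that
    `entries` is sorted by the third value (index 2).
    """
    if not entries:
        return []
    target = entries[0][0]
    starts = [entry[2] for entry in entries]
    i = bisect_left(starts, target)  # index of the first start >= target
    # The slice simply comes back shorter if one side is missing.
    return entries[max(i - 1, 0):i + 1]

sample = [
    (2310, "Experiencer: 'One'", 2114, 2117),
    (2310, "Experiencer: 'Anakin'", 2180, 2186),
    (2310, "Stimulus: 'One'", 2484, 2487),
    (2310, "Stimulus: 'Anakin'", 2564, 2570),
]
print(neighbours_by_start(sample))
```

An entry whose third value equals the target is treated as the upper neighbour here; whether that is the right choice depends on your data.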
Use itertools.groupby to group like elements in each list, then sort each group by absolute difference and take the first 2 elements:
>>> from itertools import groupby
>>>
>>> f = lambda t: t[0]
>>> {key:sorted(v, key=lambda t: abs(k-t[3]))[:2] for key,lst in d.items() for k,v in groupby(sorted(lst, key=f), f)}
{'confirmation': [(4132, "Experiencer: 'Qui-Gon's'", 4100, 4109), (4132, "Experiencer: 'Qui-Gon'", 4025, 4032)], 'hate': [(2310, "Experiencer: 'Anakin'", 2180, 2186), (2310, "Stimulus: 'One'", 2484, 2487)]}
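Note that sorting by absolute difference alone can return two entries from the same side, as the 'confirmation' result above shows (both are below 4132). If one entry per side is wanted, a possible variation is to sort by distance and then take the first match on each side. This sketch compares the third value (index 2), as in the question's example, rather than index 3; the function name is hypothetical:

```python
def nearest_on_each_side(entries):
    """Return the closest entry below and the closest entry above the target."""
    if not entries:
        return []
    target = entries[0][0]
    # Closest entries first, regardless of side.
    by_distance = sorted(entries, key=lambda t: abs(target - t[2]))
    # First hit on each side of the target; None if that side is empty.
    lower = next((t for t in by_distance if t[2] < target), None)
    upper = next((t for t in by_distance if t[2] > target), None)
    return [t for t in (lower, upper) if t is not None]

confirmation = [
    (4132, "Experiencer: 'Qui-Gon'", 4025, 4032),
    (4132, "Experiencer: 'Qui-Gon's'", 4100, 4109),
    (4132, "Stimulus: 'Anakin'", 4281, 4287),
    (4132, "Stimulus: ' '", 4355, 4356),
]
print(nearest_on_each_side(confirmation))
```

Entries whose third value equals the target are ignored in this version.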
This is the most vanilla, simple way to do it (that I could think of). Another answer uses itertools, which is more elegant but harder for a novice to understand.
If l is the list pointed to by hate in your dictionary:
target_num = l[0][0]
closest_smaller, closest_bigger = 0,0
closest_smaller_diff, closest_bigger_diff = float("inf"), float("-inf")
for element in l:
    # look at the two trailing numbers of each tuple
    for num in (element[-2], element[-1]):
        diff = target_num - num
        if diff > 0 and diff < closest_smaller_diff:
            closest_smaller = num
            closest_smaller_diff = diff
        if diff < 0 and diff > closest_bigger_diff:
            closest_bigger = num
            closest_bigger_diff = diff
print(closest_smaller, closest_bigger)
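The loop above prints bare numbers. To keep the whole tuples and filter the dictionary as the question's EDIT asks, the same single-pass idea could be wrapped in a function. This is only a sketch: closest_neighbours is a made-up name, and it compares the third value of each tuple, as in the question's example.

```python
def closest_neighbours(entries):
    """Single pass: remember the nearest tuple below and above the target."""
    lower = upper = None
    for entry in entries:
        target, start = entry[0], entry[2]
        if start < target and (lower is None or start > lower[2]):
            lower = entry  # closer from below than anything seen so far
        elif start > target and (upper is None or start < upper[2]):
            upper = entry  # closer from above than anything seen so far
    return [e for e in (lower, upper) if e is not None]

d = {
    'hate': [
        (2310, "Experiencer: 'One'", 2114, 2117),
        (2310, "Experiencer: 'Anakin'", 2180, 2186),
        (2310, "Stimulus: 'One'", 2484, 2487),
        (2310, "Stimulus: 'Padme'", 2739, 2744),
    ],
}
# Build the filtered dictionary, one key at a time.
filtered = {key: closest_neighbours(value) for key, value in d.items()}
print(filtered)
```

Because it tracks the best candidate on each side as it goes, this version does not need the lists to be sorted.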
# let big_dict be the big dictionary you start with
output_dict = {}
for key, value in big_dict.items():
    # break the list into two lists, for those with third value greater
    # and those with third value lesser/equal
    higher_tuples = [i for i in value if i[2] > i[0]]
    lower_tuples = [i for i in value if i[2] <= i[0]]
    # Get the tuple from each list whose third value is closest to the first
    # (min() raises ValueError if either list is empty)
    high_closest = min(higher_tuples, key=lambda x: x[2] - x[0])
    low_closest = min(lower_tuples, key=lambda x: x[0] - x[2])
    # bind them into an output
    output_dict[key] = [high_closest, low_closest]
If you wanted you could bind it all together in one really big one-liner:
output_dict = {key: [min([i for i in value if i[2] > i[0]], key=lambda a: a[2] - a[0]), min([j for j in value if j[2] <= j[0]], key=lambda b: b[0] - b[2])] for key, value in big_dict.items()}
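One caveat with min() on a filtered list: it raises ValueError when a key has no entries on one side of its number. A hedged variant uses the default argument of the built-in min() and max() to tolerate that case. The helper name pick_neighbours and the sample data are illustrative only, and the [higher, lower] order follows the code above:

```python
def pick_neighbours(entries):
    """Closest tuple above and below the first value; a missing side is skipped."""
    higher = min((t for t in entries if t[2] > t[0]),
                 key=lambda t: t[2], default=None)
    lower = max((t for t in entries if t[2] <= t[0]),
                key=lambda t: t[2], default=None)
    return [t for t in (higher, lower) if t is not None]

big_dict = {
    'hate': [
        (2310, "Experiencer: 'Anakin'", 2180, 2186),
        (2310, "Stimulus: 'One'", 2484, 2487),
        (2310, "Stimulus: 'Anakin'", 2564, 2570),
    ],
    # hypothetical key with nothing above its number
    'one_sided': [(10, "Experiencer: 'a'", 3, 4), (10, "Experiencer: 'b'", 7, 8)],
}
output_dict = {key: pick_neighbours(value) for key, value in big_dict.items()}
print(output_dict)
```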
Perhaps the easiest way is to work out where 2310 would appear in this list; it should fall between two elements. Those are the two elements you want.
Here's a rather broad solution that might work for you:
for key in your_dict:
    # For every list in your dictionary,
    best_fits = []
    for item in your_dict[key]:
        # For every item (a tuple) in that list,
        # if the item is a good fit, store it in best_fits.
        first_number = item[0]  # Get the 0th element of the tuple
        # The rest is up to you