HolyPotato Thu 27.6.2019 13:24

Hi, I have a problem with Spark. Does anyone know how I can write the PostgreSQL query below using RDD operations so that the final result is the same as when I run it in Postgres? I know I first need to use transformations, but I don't know how to do the join: does it work the same way as in Postgres, or differently? I was planning to write the solution in PySpark.

 

SELECT Tournaments.TYear, Countries.Name, Max(Matches.MatchDate) - Min(Matches.MatchDate) AS LENGTH
FROM Tournaments, Countries, Hosts, Teams, Matches
WHERE Tournaments.TYear = Hosts.TYear
  AND Countries.Cid = Hosts.Cid
  AND (Teams.Tid = Matches.HomeTid OR Teams.Tid = Matches.VisitTid)
  AND date_part('year', Matches.MatchDate)::text LIKE (Tournaments.TYear || '%')
GROUP BY Tournaments.TYear, Countries.Name
ORDER BY LENGTH, Tournaments.TYear ASC
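
For reference, here is a rough sketch of how the query might be expressed with RDD operations. It is not a definitive solution and rests on several assumptions that are not stated in the post: each table is already loaded as an RDD of tuples with the column order given in the comments, TYear is a four-digit integer year (so the LIKE condition reduces to "the match year equals TYear"), and MatchDate is a datetime.date. The join on Teams is left out on the assumption that every match references at least one existing team, in which case that join only duplicates rows and does not change Max or Min. The names year_key, merge_min_max, and tournament_lengths are made up for this example.

```python
from datetime import date


def year_key(match):
    """Map a match row (match_date, home_tid, visit_tid) to (year, (date, date))."""
    match_date = match[0]
    return (match_date.year, (match_date, match_date))


def merge_min_max(a, b):
    """Combine two (min_date, max_date) pairs into one."""
    return (min(a[0], b[0]), max(a[1], b[1]))


def tournament_lengths(tournaments, countries, hosts, matches):
    """Return an RDD of (tyear, country_name, length_in_days).

    Assumed row layouts (hypothetical, adapt to the real data):
      tournaments: (tyear,)
      countries:   (cid, name)
      hosts:       (tyear, cid)
      matches:     (match_date, home_tid, visit_tid)
    """
    # Max(MatchDate) - Min(MatchDate) per year, computed before joining.
    length_by_year = (matches.map(year_key)
                      .reduceByKey(merge_min_max)
                      .mapValues(lambda mm: (mm[1] - mm[0]).days))

    # Hosts JOIN Countries ON Cid. RDD joins work on (key, value) pairs,
    # so each RDD is first re-keyed by the join column.
    host_names = (hosts.map(lambda h: (h[1], h[0]))   # (cid, tyear)
                  .join(countries)                    # (cid, (tyear, name))
                  .map(lambda kv: kv[1]))             # (tyear, name)

    # Tournaments JOIN hosts JOIN match lengths ON TYear (inner joins).
    joined = (tournaments.map(lambda t: (t[0], None))
              .join(host_names)                       # (tyear, (None, name))
              .mapValues(lambda v: v[1])              # (tyear, name)
              .join(length_by_year))                  # (tyear, (name, days))

    # GROUP BY TYear, Name yields one row per pair, hence distinct();
    # ORDER BY LENGTH, TYear becomes a sort on the tuple (days, tyear).
    return (joined.map(lambda kv: (kv[0], kv[1][0], kv[1][1]))
            .distinct()
            .sortBy(lambda row: (row[2], row[0])))
```

To try it, the four inputs could be built with parallelize on an existing SparkContext and the result fetched with collect(). The main difference from Postgres is that an RDD join always matches on the key of (key, value) pairs, so every join condition turns into a map step that moves the join column into the key position.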