Working-with-enrolments
Working-with-enrolments.Rmd
library(retention.helpers)
library(tidyverse)
#> Warning: package 'tidyverse' was built under R version 4.2.2
#> ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
#> ✔ ggplot2 3.4.0 ✔ purrr 1.0.0
#> ✔ tibble 3.1.8 ✔ dplyr 1.0.10
#> ✔ tidyr 1.2.1 ✔ stringr 1.5.0
#> ✔ readr 2.1.3 ✔ forcats 0.5.2
#> Warning: package 'ggplot2' was built under R version 4.2.2
#> Warning: package 'tibble' was built under R version 4.2.2
#> Warning: package 'tidyr' was built under R version 4.2.2
#> Warning: package 'readr' was built under R version 4.2.2
#> Warning: package 'purrr' was built under R version 4.2.2
#> Warning: package 'dplyr' was built under R version 4.2.2
#> Warning: package 'stringr' was built under R version 4.2.2
#> Warning: package 'forcats' was built under R version 4.2.2
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag() masks stats::lag()
Overview
Enrolments come in two levels at Charles Sturt University; course and
subject. Course enrolments refer to the students’ enrolment in a
particular program of study, such as a Bachelor of Nursing. These are
stored in the student_course
table. Subject level
enrolments are stored in the enrolments
table, and
correspond to the enrolment records for units of study in a particular
session. These should match to the academic
table that
stores the grades for each subject taken (if the student remained past
census date).
Quirks
The enrolments
tables also includes other
pseudo-subjects, that are place holders for a Leave of Absence. These
‘subjects’ have the code XLV0000 and should not be
treated like a normal subject. Subjects beginning with the code
SSS… might also be needed to be treated differently, as
they are free subjects and exhibit very different behaviour from
students.
Counting enrolments
For this example, we set up a toy version of the
student_course
and enrolments
tables. We
rename them to match the names used in the (private)
data.csu.retention
package.
student_course <- retention.helpers::toy_student_course
enrolments <- retention.helpers::toy_enrolments
glimpse(student_course)
#> Rows: 11
#> Columns: 11
#> $ id <chr> "Student1", "Student2", "Student3", "Student1"…
#> $ course <chr> "Bachelor of Nursing", "Bachelor of Nursing", …
#> $ course_code <chr> "4415NS", "4415NS", "4415NS", "4415NS", "4421F…
#> $ course_level <chr> "Bachelors Pass", "Bachelors Pass", "Bachelors…
#> $ course_faculty <chr> "FOSH", "FOSH", "FOSH", "FOSH", "FOSH", "FOSH"…
#> $ basis_of_admission <chr> "A VET award course or VET delivered enabling …
#> $ admit_session <dbl> 202130, 202130, 202130, 202130, 202230, 202160…
#> $ catalog_year <dbl> 2021, 2021, 2021, 2021, 2022, 2021, 2021, 2022…
#> $ last_registered_session <dbl> NA, 202260, NA, NA, NA, NA, NA, 202230, 202260…
#> $ points_att <dbl> 0, 192, 0, 0, 0, 0, 0, 72, 40, 0, 144
#> $ points_comp <dbl> 0, 176, 0, 0, 0, 0, 0, 72, 40, 0, 72
glimpse(enrolments)
#> Rows: 65
#> Columns: 6
#> $ id <chr> "Student1", "Student1", "Student2", "Student2", "Student…
#> $ session <dbl> 202160, 202160, 202230, 202230, 202230, 202260, 202130, …
#> $ subject <chr> "NRS160", "NRS163", "NRS312", "NRS312", "NRS313", "NRS32…
#> $ offering <chr> "NRS160_202160_B_D", "NRS163_202160_B_D", "NRS312_202230…
#> $ enrol_date <dttm> 2021-05-06, 2021-05-06, 2021-12-08, 2021-12-08, 2021-11…
#> $ withdraw_date <dttm> 2021-06-09, 2021-06-09, NA, 2021-12-14, NA, NA, NA, 202…
Looking at subject enrolments
We might want to look at the active enrolments in a particular session.
# gets all enrolments, and counts by student
enrolments |>
filter(session == 202230) |>
count(id, sort = T) |>
rename(n_subjects = n)
#> # A tibble: 3 × 2
#> id n_subjects
#> <chr> <int>
#> 1 Student2 3
#> 2 Student4 2
#> 3 Student5 2
# gets only those who remained in the subject, then counts by student
enrolments |>
filter(session == 202230,
is.na(withdraw_date) # exluding those that have withdrawn
) |>
count(id, sort = T) |>
rename(n_subjects = n)
#> # A tibble: 1 × 2
#> id n_subjects
#> <chr> <int>
#> 1 Student2 2
Joining subject enrolments with course enroments
We might also be interested in what course these students are in. The toy data is primarily for the Bachelor of Nursing, however joining directly to the student_course table can produce errors if a student has been in multiple courses.
enrolments |>
filter(session == 202230) |>
distinct(id, session) |>
inner_join(
student_course |>
select(id, course, course_code,
admit_session, last_registered_session))
#> Joining, by = "id"
#> # A tibble: 6 × 6
#> id session course cours…¹ admit…² last_…³
#> <chr> <dbl> <chr> <chr> <dbl> <dbl>
#> 1 Student2 202230 Bachelor of Nursing 4415NS 202130 202260
#> 2 Student4 202230 Bachelor of Nursing 4415NS 202130 NA
#> 3 Student4 202230 Bachelor of Nursing 4421NS 202230 202230
#> 4 Student5 202230 Bachelor of Business (with specialis… 2421BU 202260 202260
#> 5 Student5 202230 Bachelor of Social Science (Psycholo… 2421SS 202230 NA
#> 6 Student5 202230 Bachelor of Nursing 4415NS 202130 202130
#> # … with abbreviated variable names ¹course_code, ²admit_session,
#> # ³last_registered_session
We need to select the most appropriate course, based on the
admit_session
(when the student entered the course -
however be warned this is sometimes later than their first subjects in
the course!), and their last_registered_session
. Then it
might be necessary to ignore the course_code
field as there
can be multiple course codes for a single course.
students_in_202230 <-
enrolments |>
filter(session == 202230) |>
distinct(id, session) |>
inner_join(
student_course |>
select(id, course, course_code,
admit_session, last_registered_session)) |>
group_by(id, session) |>
filter(
# exclude any courses if already completed
last_registered_session >= session | is.na(last_registered_session),
# exclude if they entered the course after the session of interest
admit_session <= session
) |>
distinct(id, session, course) |>
ungroup()
#> Joining, by = "id"
students_in_202230
#> # A tibble: 3 × 3
#> id session course
#> <chr> <dbl> <chr>
#> 1 Student2 202230 Bachelor of Nursing
#> 2 Student4 202230 Bachelor of Nursing
#> 3 Student5 202230 Bachelor of Social Science (Psychology)